Abstract. Under the key assumption of finite $\rho$-variation, $\rho \in [1,2)$, of the covariance of the
underlying Gaussian process, sharp a.s. convergence rates for approximations of Gaussian rough
paths are established. When applied to Brownian resp. fractional Brownian motion (fBM), $\rho = 1$
resp. $\rho = 1/(2H)$, we recover and extend the respective results of [Hu-Nualart; Rough path
analysis via fractional calculus; TAMS 361 (2009) 2689-2718] and [Deya-Neuenkirch-Tindel; A
Milstein-type scheme without Lévy area terms for SDEs driven by fractional Brownian motion;
AIHP (2011)]. In particular, we establish an a.s. rate $k^{-(1/\rho - 1/2 - \varepsilon)}$, any $\varepsilon > 0$, for Wong-
Zakai and Milstein-type approximations with mesh-size $1/k$. When applied to fBM this answers
a conjecture in the afore-mentioned references.



1. Introduction 

Recall that rough path theory [17], [19], [9] is a general framework that allows one to establish existence,
uniqueness and stability of differential equations driven by multi-dimensional continuous signals
$x : [0,T] \to \mathbb{R}^d$ of low regularity. Formally, a rough differential equation (RDE) is of the form

(1.1) $dy_t = \sum_{i=1}^{d} V_i(y_t)\,dx^i_t \equiv V(y_t)\,dx_t; \quad y_0 \in \mathbb{R}^e$

where $(V_i)_{i=1,\dots,d}$ is a family of vector fields in $\mathbb{R}^e$. When $x$ has finite $p$-variation, $p < 2$, such
differential equations can be handled by Young integration theory. Of course, this point of view
does not allow one to handle differential equations driven by Brownian motion, indeed

$\sup_{D} \sum_{t_i \in D} |B_{t_{i+1}} - B_{t_i}|^2 = +\infty$ a.s.,

leave alone differential equations driven by stochastic processes with less sample path regularity
than Brownian motion (such as fractional Brownian motion (fBM) with Hurst parameter $H < 1/2$).
Lyons' key insight was that low regularity of $x$, say $p$-variation or $1/p$-Hölder for some $p \in [1,\infty)$,
can be compensated by including "enough" higher order information of $x$ such as all increments

(1.2) $\mathbf{x}^n_{s,t} \equiv \int_{s<t_1<\dots<t_n<t} dx_{t_1} \otimes \dots \otimes dx_{t_n}$

(1.3) $\quad = \sum_{1 \le i_1,\dots,i_n \le d} \left( \int_{s<t_1<\dots<t_n<t} dx^{i_1}_{t_1} \cdots dx^{i_n}_{t_n} \right) e_{i_1} \otimes \dots \otimes e_{i_n} \in (\mathbb{R}^d)^{\otimes n}$

where "enough" means $n \le [p]$ ($\{e_1,\dots,e_d\}$ denotes just the usual Euclidean basis in $\mathbb{R}^d$ here).
Subject to some generalized $p$-variation (or $1/p$-Hölder) regularity, the ensemble $(\mathbf{x}^1,\dots,\mathbf{x}^{[p]})$ then
constitutes what is known as a rough path.¹ In particular, no higher order information is necessary
in the Young case; whereas the regime relevant for Brownian motion requires second order - or level 2
- information ("Lévy's area"), and so on. Note that the iterated integral on the r.h.s. of (1.2) is not -
in general - a well-defined Riemann-Stieltjes integral. Instead one typically proceeds by mollification
- given a multi-dimensional sample path $x = X(\omega)$, consider piecewise linear approximations or
convolution with a smooth kernel, compute the iterated integrals and then pass, if possible, to a
limit in probability. Following this strategy one can often construct a "canonical" enhancement of
some stochastic process to a (random) rough path. Stochastic integration and differential equations
are then discussed in a (rough) pathwise fashion; even in the complete absence of a semi-martingale
structure.

It should be emphasized that rough path theory was - from the very beginning - closely related
to higher order Euler schemes. Let $D = \{0 = t_0 < \dots < t_{\#D-1} = 1\}$ be a partition of the unit
interval.² Considering the solution $y$ of (1.1), the step-$N$ Euler approximation $y^{\mathrm{Euler}^N;D}$ is given by

$y^{\mathrm{Euler}^N;D}_0 = y_0$

$y^{\mathrm{Euler}^N;D}_{t_{j+1}} = y^{\mathrm{Euler}^N;D}_{t_j} + V_i\big(y^{\mathrm{Euler}^N;D}_{t_j}\big)\, \mathbf{x}^{i}_{t_j,t_{j+1}} + V_{i_1}V_{i_2}\big(y^{\mathrm{Euler}^N;D}_{t_j}\big)\, \mathbf{x}^{i_1,i_2}_{t_j,t_{j+1}}$
$\qquad + \dots + V_{i_1} \cdots V_{i_{N-1}} V_{i_N}\big(y^{\mathrm{Euler}^N;D}_{t_j}\big)\, \mathbf{x}^{i_1,\dots,i_N}_{t_j,t_{j+1}}$

at the points $t_j \in D$ where we use the Einstein summation convention, $V_i$ stands for the differential
operator $\sum_{k=1}^{e} V_i^k \partial_{x_k}$ and $\mathbf{x}^{i_1,\dots,i_n}_{s,t} = \int_{s<t_1<\dots<t_n<t} dx^{i_1}_{t_1} \cdots dx^{i_n}_{t_n}$. An extension of the work of A.M.
Davie (cf. [I], [S]) shows that the step-$N$ Euler scheme³ for an RDE driven by a $1/p$-Hölder
rough path with step size $1/k$ (i.e. $D = D_k = \{j/k : j = 0,\dots,k\}$) and $N \ge [p]$ will converge with
rate $O(1/k)^{(N+1)/p - 1}$. Of course, in a probabilistic context, simulation of the iterated (stochastic)
integrals $\mathbf{x}^n_{t_j,t_{j+1}}$ is not an easy matter. A natural simplification of the step-$N$ Euler scheme thus
amounts to replacing in each step

$\{\mathbf{x}^n_{t_j,t_{j+1}} : n \in \{1,\dots,N\}\} \leftarrow \{\tfrac{1}{n!} (x_{t_j,t_{j+1}})^{\otimes n} : n \in \{1,\dots,N\}\}$

which leads to the simplified step-$N$ Euler scheme

$y^{\mathrm{sEuler}^N;D}_0 = y_0$

$y^{\mathrm{sEuler}^N;D}_{t_{j+1}} = y^{\mathrm{sEuler}^N;D}_{t_j} + V_i\big(y^{\mathrm{sEuler}^N;D}_{t_j}\big)\, x^{i}_{t_j,t_{j+1}} + \tfrac{1}{2} V_{i_1}V_{i_2}\big(y^{\mathrm{sEuler}^N;D}_{t_j}\big)\, x^{i_1}_{t_j,t_{j+1}} x^{i_2}_{t_j,t_{j+1}}$
$\qquad + \dots + \tfrac{1}{N!} V_{i_1} \cdots V_{i_{N-1}} V_{i_N}\big(y^{\mathrm{sEuler}^N;D}_{t_j}\big)\, x^{i_1}_{t_j,t_{j+1}} \cdots x^{i_N}_{t_j,t_{j+1}}.$

¹ A basic theorem of rough path theory asserts that further iterated integrals up to any level $N \ge [p]$, i.e.
$S_N(\mathbf{x}) := (\mathbf{x}^n : n \in \{1,\dots,N\})$
are then deterministically determined and the map $\mathbf{x} \mapsto S_N(\mathbf{x})$, known as Lyons lift, is continuous in rough path
metrics.

² A general time horizon $[0,T]$ is handled by trivial reparametrization of time.

³ ... which one would call Milstein scheme when $N = 2$ ...

Since $\mathbf{x}^1_{t_j,t_{j+1}} = X_{t_j,t_{j+1}}(\omega) = X_{t_{j+1}}(\omega) - X_{t_j}(\omega)$ this is precisely the effect of replacing the
underlying sample path segment of $X$ by its piecewise linear approximation, i.e.

$\{X_t(\omega) : t \in [t_j,t_{j+1}]\} \leftarrow \left\{X_{t_j}(\omega) + \frac{t - t_j}{t_{j+1} - t_j}\, X_{t_j,t_{j+1}}(\omega) : t \in [t_j,t_{j+1}]\right\}.$

Therefore, as pointed out in [3] in the level $N = 2$ Hölder rough path context, it is immediate that
a Wong-Zakai type result, i.e. a.s. convergence of $y^{(k)} \to y$ for $k \to \infty$ where $y^{(k)}$ solves

$dy^{(k)}_t = V\big(y^{(k)}_t\big)\, dx^{(k)}_t; \quad y^{(k)}_0 = y_0 \in \mathbb{R}^e$

and $x^{(k)}$ is the piecewise linear approximation of $x$ at the points $(t_j)_{j=0}^{k} = D_k$, i.e.

$x^{(k)}_t = x_{t_j} + \frac{t - t_j}{t_{j+1} - t_j}\, x_{t_j,t_{j+1}}$ if $t \in [t_j,t_{j+1}]$, $t_j \in D_k$,

leads to the convergence of the simplified (and implementable!) step-$N$ Euler scheme.
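The simplified scheme can be sketched in a few lines. The following is a minimal illustration only, under assumptions that are not part of the text: a single scalar vector field $V$ (so that $VV(y) = V'(y)V(y)$), step $N = 2$, and a smooth, hypothetical driving signal `driver`, for which the test equation $dy = y\,dx$ has the explicit solution $y_1 = \exp(x_1 - x_0)$. It is a sanity check in the smooth setting, not a simulation of a rough driver.

```python
import math

def euler_step1(V, y0, x, k):
    """Plain (step-1) Euler scheme on the uniform partition D_k = {j/k}."""
    y = y0
    for j in range(k):
        dx = x((j + 1) / k) - x(j / k)            # increment x_{t_j, t_{j+1}}
        y = y + V(y) * dx
    return y

def simplified_euler_step2(V, dV, y0, x, k):
    """Simplified step-2 Euler scheme: the second iterated integral is
    replaced by (1/2) * (increment)^2. Scalar case: V V(y) = dV(y) * V(y)."""
    y = y0
    for j in range(k):
        dx = x((j + 1) / k) - x(j / k)
        y = y + V(y) * dx + 0.5 * dV(y) * V(y) * dx * dx
    return y

def driver(t):
    """Hypothetical smooth driving signal, used only for this illustration."""
    return math.sin(2.0 * math.pi * t) + t

# Test equation dy = y dx with y_0 = 1, whose solution is y_1 = exp(x_1 - x_0).
exact = math.exp(driver(1.0) - driver(0.0))
err_step1 = abs(euler_step1(lambda y: y, 1.0, driver, 1000) - exact)
err_step2 = abs(simplified_euler_step2(lambda y: y, lambda y: 1.0, 1.0, driver, 1000) - exact)
```

Only increments of the driving signal enter the scheme, which is what makes it implementable; in this smooth example the second-order correction reduces the error of the plain Euler scheme noticeably.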

While Wong-Zakai type results in rough path metrics are available for large classes of stochastic
processes [9, Chapter 13, 14, 15, 16] our focus here is on Gaussian processes which can be enhanced
to rough paths. This problem was first discussed in [3] where it was shown in particular that
piecewise linear approximations to fBM are convergent in $p$-variation rough path metric if and only
if $H > 1/4$. A practical (and essentially sharp) structural condition for the covariance, namely finite
$\rho$-variation based on rectangular increments for some $\rho < 2$ of the underlying Gaussian process was
given in [S] and allowed for a unified and detailed analysis of the resulting class of Gaussian rough
paths. This framework has since proven useful in a variety of different applications ranging from
non-Markovian Hörmander theory [2] to non-linear PDEs perturbed by space-time white-noise [12].
Of course, fractional Brownian motion can also be handled in this framework (for $H > 1/4$) and we
shall make no attempt to survey its numerous applications in engineering, finance and other fields.

Before describing our main result, let us recall in more detail some aspects of Gaussian rough path
theory (e.g. [5], Chapter 15], [IH]). The basic object is a centred, continuous Gaussian process
with sample paths $X(\omega) = (X^1(\omega),\dots,X^d(\omega)) : [0,1] \to \mathbb{R}^d$ where $X^i$ and $X^j$ are independent
for $i \ne j$. The law of this process is determined by $R_X : [0,1]^2 \to \mathbb{R}^{d \times d}$, the covariance function,
given by

$R_X(s,t) = \mathrm{diag}\big(E(X^1_s X^1_t),\dots,E(X^d_s X^d_t)\big).$

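We need

Definition 1. Let $f = f(s,t)$ be a function from $[0,1]^2$ into a normed space; for $s \le t$, $u \le v$ we
define rectangular increments as

$f\begin{pmatrix} s,t \\ u,v \end{pmatrix} = f(t,v) - f(t,u) - f(s,v) + f(s,u).$

For $p \ge 1$ we then set

$V_p(f,[s,t] \times [u,v]) = \left( \sup_{D \subset [s,t],\ \tilde{D} \subset [u,v]} \sum_{t_i \in D,\ \tilde{t}_j \in \tilde{D}} \left| f\begin{pmatrix} t_i,t_{i+1} \\ \tilde{t}_j,\tilde{t}_{j+1} \end{pmatrix} \right|^p \right)^{1/p}$

where the supremum is taken over all partitions $D$ and $\tilde{D}$ of the intervals $[s,t]$ resp. $[u,v]$. If
$V_p(f,[0,1]^2) < \infty$ we say that $f$ has finite (2D) $p$-variation.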
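As a small worked example of Definition 1, the sketch below evaluates the grid sum inside the supremum for the covariance $R(s,t) = \min(s,t)$ of standard Brownian motion with $p = 1$. The function names and the uniform grid are illustrative choices. Since the rectangular increment of $\min$ over $[s,t] \times [u,v]$ is the length of $[s,t] \cap [u,v]$, every grid sum equals 1 here, so a single grid already gives $V_1(R,[0,1]^2) = 1$; for a general $f$ one grid only yields a lower bound for the supremum.

```python
def rect_increment(f, s, t, u, v):
    """Rectangular increment f(t,v) - f(t,u) - f(s,v) + f(s,u)."""
    return f(t, v) - f(t, u) - f(s, v) + f(s, u)

def grid_variation_sum(f, grid_s, grid_u, p):
    """(sum over grid cells of |rectangular increment|^p)^(1/p) for ONE pair
    of partitions; V_p is the supremum of this quantity over all pairs."""
    total = 0.0
    for s, t in zip(grid_s, grid_s[1:]):
        for u, v in zip(grid_u, grid_u[1:]):
            total += abs(rect_increment(f, s, t, u, v)) ** p
    return total ** (1.0 / p)

def brownian_cov(s, t):
    """Covariance of standard Brownian motion."""
    return min(s, t)

grid = [i / 50 for i in range(51)]
value = grid_variation_sum(brownian_cov, grid, grid, 1.0)
```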
The main result in this context (see e.g. [HI Theorem 15.33], [TU]) now asserts that if there
exists $\rho < 2$ such that $V_\rho(R_X,[0,1]^2) < \infty$ then $X$ lifts to an enhanced Gaussian process $\mathbf{X}$ with
sample paths in the $p$-variation rough path space $C^{0,p\text{-var}}([0,1],G^{[p]}(\mathbb{R}^d))$, any $p \in (2\rho,4)$. (This
and other notations are introduced in Section 2.) This lift is "natural" in the sense that for a
large class of smooth approximations $X^{(k)}$ of $X$ (say piecewise linear, mollifier, Karhunen-Loève)
the corresponding iterated integrals of $X^{(k)}$ converge (in probability) to $\mathbf{X}$ with respect to the $p$-variation
rough path metric. (We recall from [9] that $\rho_{p\text{-var}}$, the so-called inhomogeneous $p$-variation
metric for $G^N(\mathbb{R}^d)$-valued paths, is called $p$-variation rough path metric when $[p] = N$; the Itô-Lyons
map enjoys local Lipschitz regularity in this $p$-variation rough path metric.) Moreover, this
condition is sharp; indeed fBM falls into this framework with $\rho = 1/(2H)$ and we know that
piecewise-linear approximations to Lévy's area diverge when $H = 1/4$.

Our main result (cf. Theorem 5), when applied to (mesh-size $1/k$) piecewise linear approximations
$X^{(k)}$ of $X$, reads as follows.

Theorem 1. Let $X = (X^1,\dots,X^d) : [0,1] \to \mathbb{R}^d$ be a centred Gaussian process on a probability
space $(\Omega,\mathcal{F},P)$ with continuous sample paths where $X^i$ and $X^j$ are independent for $i \ne j$. Assume
that the covariance $R_X$ has finite $\rho$-variation for $\rho \in [1,2)$ and $K \ge V_\rho(R_X,[0,1]^2)$. Then there
is an enhanced Gaussian process $\mathbf{X}$ with sample paths a.s. in $C^{0,p\text{-var}}([0,1],G^{[p]}(\mathbb{R}^d))$ for any
$p \in (2\rho,4)$ and

$\left| \rho_{p\text{-var}}\big(S_{[p]}(X^{(k)}),\mathbf{X}\big) \right|_{L^r} \to 0$

for $k \to \infty$ and every $r \ge 1$ ($|\cdot|_{L^r}$ denotes just the usual $L^r(P)$-norm for real valued random
variables here). Moreover, for any $\gamma > \rho$ such that $\frac{1}{\gamma} + \frac{1}{\rho} > 1$ and any $q > 2\gamma$ and $N \in \mathbb{N}$ there is
a constant $C = C(q,\rho,\gamma,K,N)$ such that

$\left| \rho_{q\text{-var}}\big(S_N(X^{(k)}),S_N(\mathbf{X})\big) \right|_{L^r} \le C\, r^{N/2} \sup_{0 \le t \le 1} \left| X^{(k)}_t - X_t \right|_{L^2}^{1 - \rho/\gamma}$

holds for every $k \in \mathbb{N}$.



As an immediate consequence we obtain (essentially) sharp a.s. convergence rates for Wong-Zakai
approximations and the simplified step-3 Euler scheme.

Corollary 1. Consider an RDE with $C^\infty$-bounded vector fields driven by a Gaussian Hölder rough
path $\mathbf{X}$. Then mesh-size $1/k$ Wong-Zakai approximations (i.e. solutions of ODEs driven by $X^{(k)}$)
converge uniformly with a.s. rate $k^{-(1/\rho - 1/2 - \varepsilon)}$, any $\varepsilon > 0$, to the RDE solution. The same rate is
valid for the simplified (and implementable) step-3 Euler scheme.

Proof. See Corollary 8 and Corollary 9. □



Several remarks are in order.

• Rough path analysis usually dictates that $N = 2$ (resp. $N = 3$) levels need to be considered
when $\rho \in [1,3/2)$ resp. $\rho \in [3/2,2)$. Interestingly, the situation for the Wong-Zakai error is
quite different here - referring to Theorem 1, when $\rho = 1$ we can and will take $\gamma$ arbitrarily
large in order to obtain the optimal convergence rate. Since $\rho_{q\text{-var}}$ is a rough path metric
only in the case $N = [q] \ge [2\gamma]$, we see that we need to consider all levels $N$ which is
what Theorem 1 allows us to do. On the other hand, as $\rho$ approaches 2, there is not so
much room left for taking $\gamma > \rho$. Even so, we can always find $\gamma$ with $[\gamma] = 2$ such that
$1/\gamma + 1/\rho > 1$. Picking $q > 2\gamma$ small enough shows that we need $N = [q] = 4$.

• The assumption of $C^\infty$-bounded vector fields in the corollary was for simplicity only. In
the proof we employ local Lipschitz continuity of the Itô-Lyons map for $q$-variation rough
paths (involving $N = [q]$ levels). As is well-known, this requires $\mathrm{Lip}^{q+\varepsilon}$-regularity of the
vector fields.⁴ Curiously again, we need $C^\infty$-bounded vector fields when $\rho = 1$ but only
$\mathrm{Lip}^{4+\varepsilon}$ as $\rho$ approaches the critical value 2.

• Brownian motion falls in this framework with $\rho = 1$. While the a.s. (Wong-Zakai) rate
$k^{-(1/2 - \varepsilon)}$ is part of the folklore of the subject (e.g. [TT]) the $C^\infty$-boundedness assumption
appears unnecessarily strong. Our explanation here is that our rates are universal (i.e. valid
away from one universal null-set, not dependent on starting points, coefficients etc). In particular,
the (Wong-Zakai) rates are valid on the level of stochastic flows of diffeomorphisms;
we previously discussed these issues in the Brownian context in [7].

• A surprising aspect appears in the proof of Theorem 1. The strategy is to give sharp
estimates for the levels $n = 1,\dots,4$ first, then to perform an induction similar to the one
used in Lyons' Extension Theorem ([UJ) for the higher levels. This is in contrast to the
usual considerations of level 1 to 3 only (without level 4!) which is typical for Gaussian
rough paths. (Recall that we deal with Gaussian processes which have sample paths of
finite $p$-variation, $p \in (2\rho,4)$, hence $[p] \le 3$ which indicates that we would need to control
the first 3 levels only before using the Extension Theorem.)

• Although Theorem 1 was stated here for (step-size $1/k$) piecewise linear approximations
$\{X^{(k)}\}$, the estimate holds in great generality for (Gaussian) approximations whose covariance
satisfies a uniform $\rho$-variation bound. The statements of Theorem 5 and Theorem 6
reflect this generality.

• Wong-Zakai rates for the Brownian rough path (level 2) were first discussed in [14]. They
prove that Wong-Zakai approximations converge (in $\gamma$-Hölder metric) with rate $k^{-(1/2 - \gamma - \varepsilon)}$
(in fact, a logarithmic sharpening thereof without $\varepsilon$) provided $\gamma \in (1/3,1/2)$. This restriction
on $\gamma$ is serious (for they fully rely on "level 2" rough path theory); in particular, the
best "uniform" Wong-Zakai convergence rate implied is $k^{-(1/2 - 1/3 - \varepsilon)} = k^{-(1/6 - \varepsilon)}$, leaving
a significant gap to the well-known Brownian a.s. Wong-Zakai rate.

• Wong-Zakai (and Milstein) rates for the fractional Brownian rough path (level 2 only, Hurst
parameter $H > 1/3$) were first discussed in [5]. They prove that Wong-Zakai approximations
converge (in $\gamma$-Hölder metric) with rate $k^{-(H - \gamma - \varepsilon)}$ (again, in fact, a logarithmic sharpening
thereof without $\varepsilon$) provided $\gamma \in (1/3,H)$. Again, the restriction on $\gamma$ is serious and the best
"uniform" Wong-Zakai convergence rate - and the resulting rate for the Milstein scheme
- is $k^{-(H - 1/3 - \varepsilon)}$. This should be compared to the rate $k^{-(2H - 1/2 - \varepsilon)}$ obtained from our
corollary. In fact, this rate was conjectured in [5] and is sharp as may be seen from a
precise result concerning Lévy's stochastic area for fBM, see [217].
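To make the comparison of exponents in the last two remarks concrete, here is a short arithmetic sketch. The function names are illustrative, and the $\varepsilon$ loss is dropped, so the values are the limiting exponents only: $1/\rho - 1/2$ with $\rho = 1/(2H)$ from Corollary 1, against $H - 1/3$ from the level-2 Hölder approach quoted above.

```python
def rate_exponent_corollary(H):
    """Limiting exponent 1/rho - 1/2 with rho = 1/(2H), i.e. 2H - 1/2.
    Requires rho in [1, 2), that is 1/4 < H <= 1/2."""
    rho = 1.0 / (2.0 * H)
    if not 1.0 <= rho < 2.0:
        raise ValueError("need rho in [1, 2), i.e. 1/4 < H <= 1/2")
    return 1.0 / rho - 0.5

def rate_exponent_level2_hoelder(H):
    """Limiting exponent H - 1/3 of the level-2 Hölder approach (needs H > 1/3)."""
    if not H > 1.0 / 3.0:
        raise ValueError("the level-2 approach needs H > 1/3")
    return H - 1.0 / 3.0

# Brownian motion (H = 1/2) and an fBM example with H = 0.4.
brownian = rate_exponent_corollary(0.5)
fbm_new = rate_exponent_corollary(0.4)
fbm_old = rate_exponent_level2_hoelder(0.4)
```

For $H = 0.4$ this gives an exponent of about $0.3$ compared with about $0.067$, and for $H = 1/2$ it recovers the Brownian exponent $1/2$.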

The remainder of the article is structured as follows: In Section 2 we repeat the basic notions
of (Gaussian) rough paths theory. Section 3 recalls the connection between the shuffle algebra
and iterated integrals. In particular, we will use the shuffle structure to see that in order to
show the desired estimates, we can concentrate on some iterated integrals which somehow generate
all the others. Our main tool for showing $L^2$ estimates on the lower levels is multidimensional
Young integration which we present in Section 4. The main work, namely showing the desired
$L^2$-estimates for the difference of high-order iterated integrals, is done in Section 5. After some
preliminary lemmas in Subsection 5.1 we show the estimates for the lower levels, namely for
$n = 1,2,3,4$ in Subsection 5.2, then give an induction argument in Subsection 5.3 for the higher
levels $n > 4$. Section 6 contains our main result, namely sharp a.s. convergence rates for a class of
Wong-Zakai approximations, including piecewise-linear and mollifier approximations. We further
show in Subsection 6.3 how to use these results in order to obtain sharp convergence rates for the
simplified Euler scheme.

⁴ ... in the sense of E. Stein; cf. |19l 151 for instance.

2. Notations and basic definitions

For $N \in \mathbb{N}$ we define

$T^N(\mathbb{R}^d) = \mathbb{R} \oplus \mathbb{R}^d \oplus (\mathbb{R}^d \otimes \mathbb{R}^d) \oplus \dots \oplus (\mathbb{R}^d)^{\otimes N} = \bigoplus_{n=0}^{N} (\mathbb{R}^d)^{\otimes n}$

and write $\pi_n : T^N(\mathbb{R}^d) \to (\mathbb{R}^d)^{\otimes n}$ for the projection on the $n$-th tensor level. It is clear that
$T^N(\mathbb{R}^d)$ is a (finite-dimensional) vector space. For elements $g,h \in T^N(\mathbb{R}^d)$, we define $g \otimes h \in T^N(\mathbb{R}^d)$ by

$\pi_n(g \otimes h) = \sum_{i=0}^{n} \pi_{n-i}(g) \otimes \pi_i(h).$

One can easily check that $(T^N(\mathbb{R}^d),+,\otimes)$ is an associative algebra with unit element $\mathbf{1} = \exp(0) = 1 + 0 + 0 + \dots + 0$.
We call it the truncated tensor algebra of level $N$. A norm is defined by

$|g|_{T^N(\mathbb{R}^d)} = \max_{n=0,\dots,N} |\pi_n(g)|$

which turns $T^N(\mathbb{R}^d)$ into a Banach space.
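The product formula for $T^N(\mathbb{R}^d)$ can be checked on a small example. The sketch below is an illustration only: elements are stored as dictionaries from index tuples to coefficients (a representation chosen here, not taken from the text), and `tensor_exp(v, N)` builds $\exp(v) = \sum_{n \le N} v^{\otimes n}/n!$, which is the collection of iterated integrals of a straight line with increment $v$. The check is that $\exp(v) \otimes \exp(-v)$ is the unit element.

```python
from itertools import product
from math import factorial

def tensor_mul(g, h, N):
    """Product in T^N(R^d): pi_n(g (x) h) = sum_i pi_{n-i}(g) (x) pi_i(h).

    Elements are dicts mapping index tuples to coefficients; the empty
    tuple () carries the level-0 component."""
    out = {}
    for wg, cg in g.items():
        for wh, ch in h.items():
            if len(wg) + len(wh) <= N:          # truncate above level N
                w = wg + wh
                out[w] = out.get(w, 0.0) + cg * ch
    return out

def tensor_exp(v, N):
    """exp(v) = sum_{n=0}^{N} v^{(x)n} / n! as an element of T^N(R^d)."""
    out = {(): 1.0}
    for n in range(1, N + 1):
        for w in product(range(len(v)), repeat=n):
            coef = 1.0
            for i in w:
                coef *= v[i]
            out[w] = coef / factorial(n)
    return out

N = 3
v = [0.3, -1.2]
unit = {(): 1.0}
should_be_unit = tensor_mul(tensor_exp(v, N), tensor_exp([-c for c in v], N), N)
```

All components of `should_be_unit` above level 0 vanish up to rounding, because the level-$n$ part is $v^{\otimes n} \sum_{i} (-1)^i / ((n-i)!\, i!) = 0$ for $n \ge 1$.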
For $s < t$, we define

$\Delta^n_{s,t} = \{(u_1,\dots,u_n) \in [s,t]^n ;\ u_1 < \dots < u_n\}$

which is the $n$-simplex on the square $[s,t]^n$. We will use $\Delta = \Delta^2_{0,1}$ for the 2-simplex over $[0,1]^2$.
A continuous map $\mathbf{x} : \Delta \to T^N(\mathbb{R}^d)$ is called multiplicative functional if for all $s < u < t$ one has
$\mathbf{x}_{s,t} = \mathbf{x}_{s,u} \otimes \mathbf{x}_{u,t}$. For a path $x = (x^1,\dots,x^d) : [0,1] \to \mathbb{R}^d$ and $s < t$, we will use the notation
$x_{s,t} = x_t - x_s$. If $x$ has finite variation, we define its $n$-th iterated integral by

$\mathbf{x}^n_{s,t} = \int_{\Delta^n_{s,t}} dx \otimes \dots \otimes dx = \sum_{1 \le i_1,\dots,i_n \le d} \int_{\Delta^n_{s,t}} dx^{i_1} \cdots dx^{i_n}\ e_{i_1} \otimes \dots \otimes e_{i_n} \in (\mathbb{R}^d)^{\otimes n}$

where $\{e_1,\dots,e_d\}$ denotes the Euclidean basis in $\mathbb{R}^d$ and $(s,t) \in \Delta$. The canonical lift $S_N(x) : \Delta \to T^N(\mathbb{R}^d)$ is defined by

$\pi_n\big(S_N(x)_{s,t}\big) = \mathbf{x}^n_{s,t}$ if $n \in \{1,\dots,N\}$, and $\pi_n\big(S_N(x)_{s,t}\big) = 1$ if $n = 0$.

It is well known (as a consequence of Chen's theorem) that $S_N(x)$ is a multiplicative functional.
Actually, one can show that $S_N(x)$ takes values in the smaller set $G^N(\mathbb{R}^d) \subset T^N(\mathbb{R}^d)$ defined by

$G^N(\mathbb{R}^d) = \{S_N(x)_{0,1} : x \in C^{1\text{-var}}([0,1],\mathbb{R}^d)\}$

which is still a group with $\otimes$. If $\mathbf{x},\mathbf{y} : \Delta \to T^N(\mathbb{R}^d)$ are multiplicative functionals and $p \ge 1$ we set

$\rho_{p\text{-var}}(\mathbf{x},\mathbf{y}) := \max_{n=1,\dots,N} \sup_{(t_i) \subset [0,1]} \left( \sum_i \left| \mathbf{x}^n_{t_i,t_{i+1}} - \mathbf{y}^n_{t_i,t_{i+1}} \right|^{p/n} \right)^{n/p}.$

This generalizes the $p$-variation distance induced by the usual $p$-variation semi-norm

$|x|_{p\text{-var};[s,t]} = \left( \sup_{(t_i) \subset [s,t]} \sum_i |x_{t_i,t_{i+1}}|^p \right)^{1/p}$

for paths $x : [0,1] \to \mathbb{R}^d$. The Lie group $G^N(\mathbb{R}^d)$ admits a natural norm $\|\cdot\|$, called the Carnot-
Carathéodory norm (cf. Chapter 7]). If $\mathbf{x} : \Delta \to G^N(\mathbb{R}^d)$, we set

$\|\mathbf{x}\|_{p\text{-var};[s,t]} = \left( \sup_{(t_i) \subset [s,t]} \sum_i \|\mathbf{x}_{t_i,t_{i+1}}\|^p \right)^{1/p}.$

Definition 2. The space $C^{0,p\text{-var}}([0,1],G^N(\mathbb{R}^d))$ is defined as the set of continuous paths $\mathbf{x} : \Delta \to G^N(\mathbb{R}^d)$
for which there exists a sequence of smooth paths $x_k : [0,1] \to \mathbb{R}^d$ such that $\rho_{p\text{-var}}(\mathbf{x},S_N(x_k)) \to 0$
for $k \to \infty$. If $N = [p] = \max\{n \in \mathbb{N} : n \le p\}$ we call this the space of (geometric) $p$-rough paths.

It is clear by definition that every $p$-rough path is also a multiplicative functional. By Lyons'
First Theorem (or Extension Theorem, see jTTJ, Theorem 2.2.1] or [5J Theorem 9.5]) every $p$-rough
path $\mathbf{x}$ has a unique lift to a path in $G^N(\mathbb{R}^d)$ for $N \ge [p]$. We denote this lift by $S_N(\mathbf{x})$ and call it
the Lyons lift. For a $p$-rough path $\mathbf{x}$, we will also use the notation

$\mathbf{x}^n_{s,t} = \pi_n\big(S_N(\mathbf{x})_{s,t}\big)$

for $N \ge n$. Note that this is consistent with our former definition in the case where $x$ had finite
variation. We will always use small letters for paths $x$ and capital letters for stochastic processes
$X$. The same notation introduced here will also be used for stochastic processes.

Definition 3. A function $\omega : \Delta \to \mathbb{R}_+$ is called a (1D) control if it is continuous and superadditive,
i.e. if for all $s < u < t$ one has

$\omega(s,u) + \omega(u,t) \le \omega(s,t).$

If $x : [0,1] \to \mathbb{R}^d$ is a continuous path with finite $p$-variation, one can show that

$(s,t) \mapsto V_p(x,[s,t])^p := |x|^p_{p\text{-var};[s,t]}$

is continuous and superadditive, hence defines a 1D-control function. Unfortunately, this is not the
case for higher dimensions. Recall Definition 1. If $f : [0,1]^2 \to \mathbb{R}$ has finite $p$-variation,

$(s,t),(u,v) \mapsto V_p(f,[s,t] \times [u,v])^p$

in general fails to be superadditive (cf. [10]). Therefore, we will need a second definition. If
$A = [s,t] \times [u,v]$ is a rectangle in $[0,1]^2$, we will use the notation $f(A) := f\begin{pmatrix} s,t \\ u,v \end{pmatrix}$. We call
two rectangles essentially disjoint if their intersection is empty or degenerate. A partition $\Pi$ of a
rectangle $R \subset [0,1]^2$ is a finite set of essentially disjoint rectangles whose union is $R$. The family of
all such partitions is denoted by $\mathcal{P}(R)$.

Definition 4. A function $\omega : \Delta \times \Delta \to \mathbb{R}_+$ is called a (2D) control if it is continuous, zero on
degenerate rectangles and super-additive in the sense that for all rectangles $R \subset [0,1]^2$,

$\sum_{i=1}^{n} \omega(R_i) \le \omega(R)$

whenever $\{R_i : i = 1,\dots,n\} \in \mathcal{P}(R)$. $\omega$ is called symmetric if $\omega([s,t] \times [u,v]) = \omega([u,v] \times [s,t])$
holds for all $s < t$ and $u < v$. If $f : [0,1]^2 \to B$ is a continuous function, we say that its $p$-variation
is controlled by $\omega$ if $|f(R)|^p \le \omega(R)$ holds for all rectangles $R \subset [0,1]^2$.

It is easy to see that if $\omega$ is a 2D control, $(s,t) \mapsto \omega([s,t]^2)$ defines a 1D-control.

Definition 5. For $f : [0,1]^2 \to \mathbb{R}$, $R \subset [0,1]^2$ a rectangle and $p \ge 1$ we set

$|f|_{p\text{-var};R} := \sup_{\Pi \in \mathcal{P}(R)} \left( \sum_{A \in \Pi} |f(A)|^p \right)^{1/p}.$

If $|f|_{p\text{-var};[0,1]^2} < \infty$ we say that $f$ has finite controlled $p$-variation.

The difference between the 2D $p$-variation introduced in Definition 1 and controlled $p$-variation is that
in the former, one only takes the supremum over grid-like partitions whereas in the latter, one
takes the supremum over all partitions of the rectangle. By superadditivity, the existence of a
control $\omega$ which controls the $p$-variation of $f$ implies that $f$ has finite controlled $p$-variation and
$|f|_{p\text{-var};R} \le \omega(R)^{1/p}$. In this case, we can always assume w.l.o.g. that $\omega$ is symmetric, otherwise
we just substitute $\omega$ by its symmetrization $\omega_{\mathrm{sym}}$ given by

$\omega_{\mathrm{sym}}([s,t] \times [u,v]) = \omega([s,t] \times [u,v]) + \omega([u,v] \times [s,t]).$

The connection between finite variation and finite controlled $p$-variation is summarized in the following
theorem.

Theorem 2. Let $f : [0,1]^2 \to \mathbb{R}$ be continuous and $R \subset [0,1]^2$ be a rectangle.

(1) We have

$V_1(f,R) = |f|_{1\text{-var};R}.$

(2) For any $p \ge 1$ and $\varepsilon > 0$ there is a constant $C = C(p,\varepsilon)$ such that

$\frac{1}{C} |f|_{(p+\varepsilon)\text{-var};R} \le V_p(f,R) \le |f|_{p\text{-var};R}.$

(3) If $f$ has finite controlled $p$-variation, then

$R \mapsto |f|^p_{p\text{-var};R}$

is a 2D-control. In particular, there exists a 2D-control $\omega$ such that for all rectangles
$R \subset [0,1]^2$ we have $|f(R)|^p \le \omega(R)$, i.e. $\omega$ controls the $p$-variation of $f$.

Proof. [10, Theorem 1]. □

In the following, unless mentioned otherwise, $X$ will always be a Gaussian process as in Theorem 1
and $\mathbf{X}$ denotes the natural Gaussian rough path. We will need the following Proposition:

Proposition 1. Let $X$ be as in Theorem 1 and assume that $\omega$ controls the $\rho$-variation of the
covariance of $X$, $\rho \in [1,2)$. Then for every $n \in \mathbb{N}$ there is a constant $C(n) = C(n,\rho)$ such that

$\left| \mathbf{X}^n_{s,t} \right|_{L^2} \le C(n)\, \omega([s,t]^2)^{\frac{n}{2\rho}}$

for any $s < t$.

Proof. For $n = 1,2,3$ this is proven in [SJ Proposition 15.28]. For $n \ge 4$ and fixed $s < t$, we set
$\tilde{X}_\tau := \omega([s,t]^2)^{-\frac{1}{2\rho}}\, X_{s+\tau(t-s)}$. Then $|R_{\tilde{X}}|^\rho_{\rho\text{-var};[0,1]^2} \le 1 =: K$ and by the standard (deterministic)
estimates for the Lyons lift,

$\frac{\left| \mathbf{X}^n_{s,t} \right|^{1/n}}{\omega([s,t]^2)^{\frac{1}{2\rho}}} = \left| \tilde{\mathbf{X}}^n_{0,1} \right|^{1/n} \le c_1 \left\| S_n(\tilde{\mathbf{X}}) \right\|_{p\text{-var};[0,1]} \le c_2(n,p) \left\| \tilde{\mathbf{X}} \right\|_{p\text{-var};[0,1]}$

for any $p \in (2\rho,4)$. Now we raise both sides to the power $n$ and take the $L^2$-norm. From [SI Theorem 15.33] we know
that $\big| \|\tilde{\mathbf{X}}\|_{p\text{-var};[0,1]} \big|_{L^{2n}}$ is bounded by a constant only depending on $p$, $\rho$ and $K$ which shows the
claim.

Alternatively (and more in the spirit of the forthcoming arguments), one performs an induction
similar to (but easier than) the one in the proof of Proposition 8. □



3. Iterated integrals and the shuffle algebra 

Let $x = (x^1,\dots,x^d) : [0,1] \to \mathbb{R}^d$ be a path of finite variation. Forming finite linear combinations
of iterated integrals of the form

$\int_{\Delta^n_{0,1}} dx^{i_1} \cdots dx^{i_n}, \quad i_1,\dots,i_n \in \{1,\dots,d\},\ n \in \mathbb{N}$

defines a vector space over $\mathbb{R}$. In this section, we will see that this vector space is also an algebra
where the product is given simply by taking the usual multiplication. Moreover, we will describe
precisely what the product of two iterated integrals looks like.

3.1. The shuffle algebra. Let $A$ be a set which we will call from now on the alphabet. In the
following, we will only consider the finite alphabet $A = \{a,b,\dots\} = \{a_1,a_2,\dots,a_d\} = \{1,\dots,d\}$.
We denote by $A^*$ the set of words composed by the letters of $A$, hence $w = a_{i_1}a_{i_2} \dots a_{i_n}$, $a_{i_j} \in A$.
The empty word is denoted by $e$. $A^+$ is the set of non-empty words. The length of the word is
denoted by $|w|$ and $|w|_a$ denotes the number of occurrences of the letter $a$. We denote by $\mathbb{R}\langle A \rangle$
the vector space of noncommutative polynomials on $A$ over $\mathbb{R}$, hence every $P \in \mathbb{R}\langle A \rangle$ is a linear
combination of words in $A^*$ with coefficients in $\mathbb{R}$. $(P,w)$ denotes the coefficient in $P$ of the word
$w$. Hence every polynomial $P$ can be written as

$P = \sum_{w \in A^*} (P,w)\, w$

and the sum is finite since the $(P,w)$ are non-zero only for a finite set of words $w$. We define the
degree of $P$ as

$\deg(P) = \max\{|w| ;\ (P,w) \ne 0\}.$

A polynomial is called homogeneous if all monomials have the same degree. We want to define a
product on $\mathbb{R}\langle A \rangle$. Since a polynomial is determined by its coefficients on each word, we can define
the product $PQ$ of $P$ and $Q$ by

$(PQ,w) = \sum_{w = uv} (P,u)(Q,v).$

Note that this definition coincides with the usual multiplication in a (noncommutative) polynomial
ring. We call this product the concatenation product and the algebra $\mathbb{R}\langle A \rangle$ endowed with this
product the concatenation algebra.

There is another product on $\mathbb{R}\langle A \rangle$ which will be of special interest for us. We need some
notation first. Given a word $w = a_{i_1}a_{i_2} \dots a_{i_n}$ and a subsequence $U = (j_1,j_2,\dots,j_k)$ of $(i_1,\dots,i_n)$,
we denote by $w(U)$ the word $a_{j_1}a_{j_2} \dots a_{j_k}$ and we call $w(U)$ a subword of $w$. If $w,u,v$ are words
and if $w$ has length $n$, we denote by $\binom{w}{u\ v}$ the number of subsequences $U$ of $(1,\dots,n)$ such
that $w(U) = u$ and $w(U^c) = v$.

Definition 6. The (homogeneous) polynomial

$u * v = \sum_{w \in A^*} \binom{w}{u\ v}\, w$

is called the shuffle product of $u$ and $v$. By linearity we extend it to a product on $\mathbb{R}\langle A \rangle$.
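The coefficients $\binom{w}{u\ v}$ can be computed by the usual recursion $u * v = u_1 (u' * v) + v_1 (u * v')$, where $u = u_1 u'$ and $v = v_1 v'$. The sketch below is one possible implementation (words are Python strings, and the function name is an illustrative choice); it returns the shuffle product as a multiset of words.

```python
from collections import Counter

def shuffle(u, v):
    """Shuffle product u * v of two words, returned as a Counter that maps
    each word w to its coefficient (the number of ways w splits into u and v)."""
    if not u:
        return Counter({v: 1})
    if not v:
        return Counter({u: 1})
    result = Counter()
    for w, c in shuffle(u[1:], v).items():   # words starting with the first letter of u
        result[u[0] + w] += c
    for w, c in shuffle(u, v[1:]).items():   # words starting with the first letter of v
        result[v[0] + w] += c
    return result

b_times_aac = shuffle("b", "aac")
ab_times_a = shuffle("ab", "a")
```

For instance, $b * aac = baac + abac + aabc + aacb$, while $ab * a = 2\,aab + aba$ since the word $aab$ can be split into $ab$ and $a$ in two ways.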

In order to prove our main result, we want to use some sort of induction over the length of the
words. Therefore, the following definition will be useful.

Definition 7. If $U$ is a set of words of the same length, we call a subset $\{w_1,\dots,w_k\}$ of $U$ a
generating set for $U$ if for every word $w \in U$ there is a polynomial $R$ and real numbers $\lambda_1,\dots,\lambda_k$
such that

$w = \sum_{j=1}^{k} \lambda_j w_j + R$

where $R$ is of the form $R = \sum_{u,v \in A^+} \mu_{u,v}\, u * v$ for real numbers $\mu_{u,v}$.

Definition 8. We say that a word $w$ is composed by $a_1^{n_1},\dots,a_d^{n_d}$ if $w \in \{a_1,\dots,a_d\}^*$ and $|w|_{a_i} = n_i$
for $i = 1,\dots,d$, hence every letter appears in the word with the given multiplicity.

The aim now is to find a (possibly small) generating set for the set of all words composed by
some given letters. The next definition introduces a special class of words which will be important
for us.

Definition 9. Let $A$ be totally ordered and put on $A^*$ the alphabetical order. If $w$ is a word such
that whenever $w = uv$ for $u,v \in A^+$ one has $u < v$, then $w$ is called a Lyndon word.

Proposition 2. (1) For the set {words composed by $a,a,b$} a generating set is given by $\{aab\}$.

(2) For the set {words composed by $a,a,a,b$} a generating set is given by $\{aaab\}$.

(3) For the set {words composed by $a,a,b,b$} a generating set is given by $\{aabb\}$.

(4) For the set {words composed by $a,a,b,c$} a generating set is given by $\{aabc,aacb,baac\}$.

Proof. Consider the alphabet $A = \{a,b,c\}$. We choose the order $a < b < c$. A general theorem
states that every word $w$ has a unique decreasing factorization into Lyndon words, i.e. $w = l_1^{i_1} \dots l_k^{i_k}$
where $l_1 > \dots > l_k$ are Lyndon words and $i_1,\dots,i_k \ge 1$ (see [3TJ Theorem 5.1 and Corollary 4.7]),
and the formula

$\frac{1}{i_1! \cdots i_k!}\, l_1^{*i_1} * \dots * l_k^{*i_k} = w + \sum_{u < w} \alpha_u u$

holds, where $\alpha_u$ are some natural integers (see again [21, Theorem 6.1]). By repeatedly applying
this formula for the words in the sum on the right hand side, it follows that a generating set for
each of the sets in (1) to (4) is given exactly by the Lyndon words composed by these letters.
One can easily show that indeed $aab$, $aaab$ and $aabb$ are the only Lyndon words composed by
the corresponding letters. The Lyndon words composed by $a,a,b,c$ are $\{aabc,abac,aacb\}$ which
therefore is a generating set for {words composed by $a,a,b,c$}. Since $b * aac = baac + abac + aabc + aacb$,
we have the shuffle identity

$abac = b * aac - baac - aabc - aacb$

from which it follows that also $\{aabc,aacb,baac\}$ generates this set. □
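The claim about which Lyndon words are composed by the given letters is easy to confirm by brute force. The sketch below applies Definition 9 literally (every factorization $w = uv$ into non-empty words must satisfy $u < v$) to all arrangements of a given multiset of letters; the function names are illustrative, and Python's built-in string comparison serves as the alphabetical order with $a < b < c$.

```python
from itertools import permutations

def is_lyndon(w):
    """Definition 9: w is a Lyndon word if u < v for every factorization
    w = uv into non-empty words u and v."""
    return len(w) > 0 and all(w[:i] < w[i:] for i in range(1, len(w)))

def lyndon_words_composed_by(letters):
    """All Lyndon words that use exactly the given multiset of letters."""
    words = {"".join(p) for p in permutations(letters)}
    return sorted(w for w in words if is_lyndon(w))

lyndon_aab = lyndon_words_composed_by("aab")
lyndon_aaab = lyndon_words_composed_by("aaab")
lyndon_aabb = lyndon_words_composed_by("aabb")
lyndon_aabc = lyndon_words_composed_by("aabc")
```

The enumeration finds exactly one Lyndon word in the first three cases and the three words $aabc$, $aacb$, $abac$ in the last one.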

3.2. The connection to iterated integrals. Let $x = (x^1,\dots,x^d) : [0,1] \to \mathbb{R}^d$ be a path of finite
variation and fix $s < t \in [0,1]$. For a word $w = (a_{i_1} \dots a_{i_n}) \in A^*$, $A = \{1,\dots,d\}$ we define

$\mathbf{x}^w_{s,t} = \int_{\Delta^n_{s,t}} dx^{i_1} \cdots dx^{i_n}$ if $w \in A^+$, and $\mathbf{x}^w_{s,t} = 1$ if $w = e$.

Let $(\mathbb{R}\langle A \rangle,+,*)$ be the shuffle algebra over the alphabet $A$. We define a map $\Phi : \mathbb{R}\langle A \rangle \to \mathbb{R}$ by
$\Phi(w) = \mathbf{x}^w_{s,t}$ and extend it linearly to polynomials $P \in \mathbb{R}\langle A \rangle$. The key observation is the following:

Theorem 3. $\Phi$ is an algebra homomorphism from the shuffle algebra $(\mathbb{R}\langle A \rangle,+,*)$ to $(\mathbb{R},+,\cdot)$.

Proof. [H], Corollary 3.5. □
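The homomorphism property can be checked numerically in the simplest case. The sketch below assumes, for illustration only, the straight line $x_t = t\,v$ on $[0,1]$, for which the iterated integral over a word $w = i_1 \dots i_n$ is $v_{i_1} \cdots v_{i_n} / n!$. Words are tuples of indices, and the increment `inc` and the two words are arbitrary choices.

```python
from collections import Counter
from math import factorial, isclose

def shuffle(u, v):
    """Shuffle product of two words (tuples of letters) as a Counter."""
    if not u:
        return Counter({v: 1})
    if not v:
        return Counter({u: 1})
    result = Counter()
    for w, c in shuffle(u[1:], v).items():
        result[u[:1] + w] += c
    for w, c in shuffle(u, v[1:]).items():
        result[v[:1] + w] += c
    return result

def line_integral(w, inc):
    """Iterated integral over the word w for the straight line x_t = t * inc
    on [0, 1]: the product of the increments divided by |w|!."""
    value = 1.0
    for i in w:
        value *= inc[i]
    return value / factorial(len(w))

def phi(poly, inc):
    """Linear extension of w -> line_integral(w, inc) to polynomials."""
    return sum(c * line_integral(w, inc) for w, c in poly.items())

inc = [0.7, -0.4, 1.5]
u, v = (0, 1), (2, 0, 1)
lhs = line_integral(u, inc) * line_integral(v, inc)   # Phi(u) * Phi(v)
rhs = phi(shuffle(u, v), inc)                          # Phi(u * v)
```

Both sides agree: the shuffle of a word of length 2 with a word of length 3 has $\binom{5}{2} = 10$ terms counted with multiplicity, each contributing the same product divided by $5!$, which matches the division by $2! \cdot 3!$ on the left.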

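The next proposition shows that it suffices to show the desired estimates only
for the iterated integrals which generate the others.

Proposition 3. Let $(X,Y) = (X^1,Y^1,\dots,X^d,Y^d)$ be a Gaussian process on $[0,1]$ with paths
of finite variation. Let $A = \{1,\dots,d\}$ be the alphabet, let $U$ be a set of words of length $n$ and
$V = \{w_1,\dots,w_k\}$ be a generating set for $U$. Let $\omega$ be a control, $\rho,\gamma \ge 1$ constants and $s < t \in [0,1]$.
Assume that there are constants $C = C(|w|)$ such that

$\left| \mathbf{X}^w_{s,t} \right|_{L^2} \le C(|w|)\, \omega(s,t)^{\frac{|w|}{2\rho}}$ and $\left| \mathbf{Y}^w_{s,t} \right|_{L^2} \le C(|w|)\, \omega(s,t)^{\frac{|w|}{2\rho}}$

holds for every word $w \in A^*$ with $|w| \le n-1$. Assume also that for some $\varepsilon > 0$

$\left| \mathbf{X}^w_{s,t} - \mathbf{Y}^w_{s,t} \right|_{L^2} \le C(|w|)\, \varepsilon\, \omega(s,t)^{\frac{1}{2\gamma}}\, \omega(s,t)^{\frac{|w|-1}{2\rho}}$

holds for every word $w$ with $|w| \le n-1$ and $w \in V$. Then there is a constant $\tilde{C}$ which depends on
the constants $C$, on $n$ and on $d$ such that

$\left| \mathbf{X}^w_{s,t} - \mathbf{Y}^w_{s,t} \right|_{L^2} \le \tilde{C}\, \varepsilon\, \omega(s,t)^{\frac{1}{2\gamma}}\, \omega(s,t)^{\frac{n-1}{2\rho}}$

holds for every $w \in U$.

Remark 1. We could account for the factor $\omega(s,t)^{\frac{1}{2\gamma}}$ in $\varepsilon$ here but the present form is how we
shall use this proposition later on.

Proof. Consider a copy $\bar{A}$ of $A$. If $a \in A$, we denote by $\bar{a}$ the corresponding letter in $\bar{A}$. If $w =
a_{i_1} \dots a_{i_n} \in A^*$, we define $\bar{w} = \bar{a}_{i_1} \dots \bar{a}_{i_n} \in \bar{A}^*$ and in the same way we define $\bar{P} \in \mathbb{R}\langle \bar{A} \rangle$ for $P \in
\mathbb{R}\langle A \rangle$. Now we consider $\mathbb{R}\langle A \sqcup \bar{A} \rangle$ equipped with the usual shuffle product. Define $\Psi : \mathbb{R}\langle A \sqcup \bar{A} \rangle \to \mathbb{R}$
by

$\Psi(w) = \int_{\Delta^n_{s,t}} dZ^{b_{i_1}} \cdots dZ^{b_{i_n}}$

for a word $w = b_{i_1} \dots b_{i_n}$ where

$Z^{b_j} = X^{a_j}$ for $b_j = a_j$, and $Z^{b_j} = Y^{a_j}$ for $b_j = \bar{a}_j$

and extend this definition linearly. By Theorem 3 we know that $\Psi$ is an algebra homomorphism.
Take $w \in U$. By assumption, we know that there is a vector $\lambda = (\lambda_1,\dots,\lambda_k)$ such that

$w - \bar{w} = \sum_{j=1}^{k} \lambda_j (w_j - \bar{w}_j) + R - \bar{R}$

where $R$ is of the form $R = \sum_{u,v \in A^+,\ |u|+|v|=n} \mu_{u,v}\, u * v$ with real numbers $\mu_{u,v}$. Applying $\Psi$ and
taking the $L^2$ norm yields

$\left| \mathbf{X}^w_{s,t} - \mathbf{Y}^w_{s,t} \right|_{L^2} \le \sum_{j=1}^{k} |\lambda_j| \left| \mathbf{X}^{w_j}_{s,t} - \mathbf{Y}^{w_j}_{s,t} \right|_{L^2} + \left| \Psi(R - \bar{R}) \right|_{L^2}$
$\qquad \le c_1\, \varepsilon\, \omega(s,t)^{\frac{1}{2\gamma}}\, \omega(s,t)^{\frac{n-1}{2\rho}} + \left| \Psi(R - \bar{R}) \right|_{L^2}.$

Now,

$R - \bar{R} = \sum_{u,v} \mu_{u,v} (u * v - \bar{u} * \bar{v}) = \sum_{u,v} \mu_{u,v} (u - \bar{u}) * v + \mu_{u,v}\, \bar{u} * (v - \bar{v}).$

Applying $\Psi$ and taking the $L^2$ norm gives then

$\left| \Psi(R - \bar{R}) \right|_{L^2} \le \sum_{u,v} |\mu_{u,v}| \left| (\mathbf{X}^u_{s,t} - \mathbf{Y}^u_{s,t})\, \mathbf{X}^v_{s,t} \right|_{L^2} + |\mu_{u,v}| \left| \mathbf{Y}^u_{s,t} (\mathbf{X}^v_{s,t} - \mathbf{Y}^v_{s,t}) \right|_{L^2}$
$\qquad \le c_2 \sum_{u,v} |\mu_{u,v}| \left( \left| \mathbf{X}^u_{s,t} - \mathbf{Y}^u_{s,t} \right|_{L^2} \left| \mathbf{X}^v_{s,t} \right|_{L^2} + \left| \mathbf{Y}^u_{s,t} \right|_{L^2} \left| \mathbf{X}^v_{s,t} - \mathbf{Y}^v_{s,t} \right|_{L^2} \right)$
$\qquad \le c_3\, \varepsilon\, \omega(s,t)^{\frac{1}{2\gamma}} \sum_{u,v} \omega(s,t)^{\frac{|u|+|v|-1}{2\rho}}$
$\qquad \le c_4\, \varepsilon\, \omega(s,t)^{\frac{1}{2\gamma}}\, \omega(s,t)^{\frac{n-1}{2\rho}}$

where we used equivalence of $L^q$-norms in the Wiener Chaos (cf. [3J Proposition 15.19 and Theorem
D.8]). Putting all together shows the assertion. □

4. Multidimensional Young-integration and grid-controls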
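Let $f\colon [0,1]^n \to \mathbb{R}$ be a continuous function. If $s_1 < t_1, \ldots, s_n < t_n$ and $u_1, \ldots, u_n$ are elements in $[0,1]$, we make the following recursive definition:

$$f\begin{pmatrix} s_1, t_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} := f\begin{pmatrix} t_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} - f\begin{pmatrix} s_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$$

and

$$f\begin{pmatrix} s_1, t_1 \\ \vdots \\ s_{k-1}, t_{k-1} \\ s_k, t_k \\ u_{k+1} \\ \vdots \\ u_n \end{pmatrix} := f\begin{pmatrix} s_1, t_1 \\ \vdots \\ s_{k-1}, t_{k-1} \\ t_k \\ u_{k+1} \\ \vdots \\ u_n \end{pmatrix} - f\begin{pmatrix} s_1, t_1 \\ \vdots \\ s_{k-1}, t_{k-1} \\ s_k \\ u_{k+1} \\ \vdots \\ u_n \end{pmatrix}.$$

We will also use the simpler notation

$$f(R) = f\begin{pmatrix} s_1, t_1 \\ \vdots \\ s_n, t_n \end{pmatrix}$$

for the rectangle $R = [s_1,t_1] \times \ldots \times [s_n,t_n] \subset [0,1]^n$. Note that for $n = 2$ this is consistent with our initial definition of $f\begin{pmatrix} s_1, t_1 \\ s_2, t_2 \end{pmatrix}$. If $f,g\colon [0,1]^n \to \mathbb{R}$ are continuous functions, the $n$-dimensional Young-integral is defined by

$$\int_{[s_1,t_1] \times \ldots \times [s_n,t_n]} f(x_1,\ldots,x_n)\, dg(x_1,\ldots,x_n) := \lim_{|D_1|,\ldots,|D_n| \to 0} \sum_{(t^1_{i_1}) \subset D_1, \ldots, (t^n_{i_n}) \subset D_n} f(t^1_{i_1},\ldots,t^n_{i_n})\, g\begin{pmatrix} t^1_{i_1}, t^1_{i_1+1} \\ \vdots \\ t^n_{i_n}, t^n_{i_n+1} \end{pmatrix}$$

if this limit exists. Take $p \ge 1$. The $n$-dimensional $p$-variation of $f$ is defined by

$$V_p(f, [s_1,t_1] \times \ldots \times [s_n,t_n]) = \left( \sup_{D_1 \subset [s_1,t_1], \ldots, D_n \subset [s_n,t_n]} \sum_{(t^1_{i_1}) \subset D_1, \ldots, (t^n_{i_n}) \subset D_n} \left| f\begin{pmatrix} t^1_{i_1}, t^1_{i_1+1} \\ \vdots \\ t^n_{i_n}, t^n_{i_n+1} \end{pmatrix} \right|^p \right)^{1/p}$$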
and if $V_p(f,[0,1]^n) < \infty$ we say that $f$ has finite ($n$-dimensional) $p$-variation. The fundamental theorem is the following:

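Theorem 4. Assume that $f$ has finite $p$-variation and $g$ finite $q$-variation where $\frac{1}{p} + \frac{1}{q} > 1$. Then the joint Young-integral below exists and there is a constant $C = C(p,q)$ such that

$$\left| \int_{[s_1,t_1] \times \ldots \times [s_n,t_n]} f\begin{pmatrix} s_1, u_1 \\ \vdots \\ s_n, u_n \end{pmatrix} dg(u_1,\ldots,u_n) \right| \le C\, V_p(f, [s_1,t_1] \times \ldots \times [s_n,t_n])\, V_q(g, [s_1,t_1] \times \ldots \times [s_n,t_n]).$$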

Proof. [22], Theorem 1.2 (c). □ 
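For $n = 2$ the rectangular increment and the $p$-variation sum over one fixed pair of partitions can be sketched as follows. The helper names `rect_increment` and `grid_p_variation` are hypothetical. The sketch evaluates only a single grid, so it gives a lower bound for $V_p$, which is a supremum over all partitions.

```python
def rect_increment(f, s1, t1, s2, t2):
    """Rectangular increment of f over [s1, t1] x [s2, t2]."""
    return f(t1, t2) - f(s1, t2) - f(t1, s2) + f(s1, s2)


def grid_p_variation(f, grid1, grid2, p):
    """(sum over grid cells of |f(cell)|^p)^(1/p) for one fixed pair of partitions."""
    total = 0.0
    for a, b in zip(grid1[:-1], grid1[1:]):
        for c, d in zip(grid2[:-1], grid2[1:]):
            total += abs(rect_increment(f, a, b, c, d)) ** p
    return total ** (1.0 / p)


if __name__ == "__main__":
    grid = [k / 50 for k in range(51)]
    # Covariance of Brownian motion, R(u, v) = min(u, v): only diagonal cells contribute.
    print(grid_p_variation(min, grid, grid, 1.0))
    # A product function u * v: every cell contributes (1/50)^2.
    print(grid_p_variation(lambda u, v: u * v, grid, grid, 1.0))
```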

We will mainly consider the case $n = 2$, but we will also need $n = 3$ and $4$ later on. In particular, the discussion of level $n = 4$ will require us to work with 4D grid control functions which we now introduce. With no extra complication we make the following general definition.

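Definition 10 (n-dimensional grid control). A map $\tilde\omega\colon \underbrace{\Delta \times \ldots \times \Delta}_{n\text{-times}} \to \mathbb{R}^+$ is called an $n$-D grid-control if it is continuous and partially super-additive, i.e. for all $(s_1,t_1), \ldots, (s_n,t_n) \in \Delta$ and $s_i < u_i < t_i$ we have

$$\tilde\omega([s_1,t_1] \times \ldots \times [s_i,u_i] \times \ldots \times [s_n,t_n]) + \tilde\omega([s_1,t_1] \times \ldots \times [u_i,t_i] \times \ldots \times [s_n,t_n]) \le \tilde\omega([s_1,t_1] \times \ldots \times [s_i,t_i] \times \ldots \times [s_n,t_n])$$

for every $i = 1, \ldots, n$. $\tilde\omega$ is called symmetric if

$$\tilde\omega([s_1,t_1] \times \ldots \times [s_n,t_n]) = \tilde\omega([s_{\sigma(1)},t_{\sigma(1)}] \times \ldots \times [s_{\sigma(n)},t_{\sigma(n)}])$$

holds for every $\sigma \in S_n$.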

The point of this definition is that $|f(A)|^p \le \tilde\omega(A)$ for every rectangle $A \subset [0,1]^n$ implies that $V_p(f,R)^p \le \tilde\omega(R)$ for every rectangle $R \subset [0,1]^n$. Note that a 2D control in the sense of Definition 4 is automatically a 2D grid-control. The following immediate properties will be used in Section 5.2.3 with $m = n = 2$.

Lemma 1. (1) The restriction of an $(m+n)$-dimensional grid-control to $m$ arguments is an $m$-dimensional grid-control.
(2) The product of an $m$- and an $n$-dimensional grid-control is an $(m+n)$-dimensional grid-control.
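As an illustration of partial super-additivity, the following sketch builds a 2D grid-control as a product of two one-dimensional controls and checks the defining inequality at random split points. The choice $\omega(s,t) = (t-s)^2$ and all function names are assumptions made for this example only.

```python
import random


def omega_1d(s, t, theta=2.0):
    """A simple 1D control: (t - s)^theta is super-additive for theta >= 1."""
    return (t - s) ** theta


def product_control(rect):
    """Product of 1D controls over a rectangle given as [(s_1, t_1), ..., (s_n, t_n)]."""
    value = 1.0
    for s, t in rect:
        value *= omega_1d(s, t)
    return value


def is_partially_superadditive(control, rect, i, u, tol=1e-12):
    """Check control(left part) + control(right part) <= control(rect) when side i is split at u."""
    s, t = rect[i]
    left = list(rect)
    left[i] = (s, u)
    right = list(rect)
    right[i] = (u, t)
    return control(left) + control(right) <= control(rect) + tol


if __name__ == "__main__":
    random.seed(0)
    rect = [tuple(sorted((random.random(), random.random()))) for _ in range(2)]
    u = random.uniform(*rect[0])
    print(is_partially_superadditive(product_control, rect, 0, u))
```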

4.1. Iterated 2D-integrals. In the 1-dimensional case, the classical Young-theory allows us to define iterated integrals of functions with finite $p$-variation where $p < 2$. There, the superadditivity of $(s,t) \mapsto |\cdot|^p_{p\text{-var};[s,t]}$ played an essential role. We will see that Theorem 2 can be used to define and estimate iterated 2D-integrals. This will play an important role in Section 5 when we estimate the $L^2$-norm of iterated integrals of Gaussian processes.

Lemma 2. Let $f,g\colon [0,1]^2 \to \mathbb{R}$ be continuous where $f$ has finite $p$-variation and $g$ finite controlled $q$-variation with $p^{-1} + q^{-1} > 1$. Let $(s,t) \in \Delta$ and assume that $f(s,\cdot) = f(\cdot,s) = 0$. Define $\Phi\colon [s,t]^2 \to \mathbb{R}$ by

$$\Phi(u,v) = \int_{[s,u] \times [s,v]} f\, dg.$$

Then there is a constant $C = C(p,q)$ such that

$$V_{q\text{-var}}(\Phi; [s,t]^2) \le C(p,q)\, V_{p\text{-var}}(f; [s,t]^2)\, |g|_{q\text{-var};[s,t]^2}.$$

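Proof. Let $t_i < t_{i+1}$ and $\tilde t_j < \tilde t_{j+1}$. Then,

$$\Phi\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} = \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f\, dg.$$

Now let $t_i < u < t_{i+1}$ and $\tilde t_j < v < \tilde t_{j+1}$. Then one has

$$f\begin{pmatrix} t_i, u \\ \tilde t_j, v \end{pmatrix} = f(u,v) - f(t_i,v) - f(u,\tilde t_j) + f(t_i,\tilde t_j).$$

Therefore,

$$\left| \Phi\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right| \le \left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f\begin{pmatrix} t_i, u \\ \tilde t_j, v \end{pmatrix} dg(u,v) \right| + \left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(t_i,v)\, dg(u,v) \right|$$
$$+ \left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(u,\tilde t_j)\, dg(u,v) \right| + \left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(t_i,\tilde t_j)\, dg(u,v) \right|.$$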

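For the first integral we use Young 2D-estimates to see that

$$\left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f\begin{pmatrix} t_i, u \\ \tilde t_j, v \end{pmatrix} dg(u,v) \right| \le c_1(p,q)\, V_p(f, [t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}])\, V_q(g, [t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}])$$
$$\le c_1(p,q)\, V_p(f,[s,t]^2)\, |g|_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$

For the second, one has by a Young 1D-estimate

$$\left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(t_i,v)\, dg(u,v) \right| = \left| \int_{[\tilde t_j,\tilde t_{j+1}]} f(t_i,v)\, d\big(g(t_{i+1},v) - g(t_i,v)\big) \right| \le c_2 \sup_{u \in [s,t]} |f(u,\cdot)|_{p\text{-var};[s,t]}\, |g|_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$

Similarly,

$$\left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(u,\tilde t_j)\, dg(u,v) \right| \le c_2 \sup_{v \in [s,t]} |f(\cdot,v)|_{p\text{-var};[s,t]}\, |g|_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$

Finally,

$$\left| \int_{[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} f(t_i,\tilde t_j)\, dg(u,v) \right| = \left| f(t_i,\tilde t_j)\, g\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right| \le |f|_{\infty;[s,t]^2}\, |g|_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$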
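Putting all together, we get

$$\left| \Phi\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right| \le c_3 \left( V_p(f,[s,t]^2) + \sup_{u \in [s,t]} |f(u,\cdot)|_{p\text{-var};[s,t]} + \sup_{v \in [s,t]} |f(\cdot,v)|_{p\text{-var};[s,t]} + |f|_{\infty;[s,t]^2} \right) |g|_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$

Take a partition $\tilde D \subset [s,t]$ and $u \in [s,t]$. Then

$$\sum_{\tilde t_j \in \tilde D} |f(u,\tilde t_{j+1}) - f(u,\tilde t_j)|^p = \sum_{\tilde t_j \in \tilde D} \left| f\begin{pmatrix} s, u \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^p \le V_p(f,[s,t]^2)^p$$

and hence

$$\sup_{u \in [s,t]} |f(u,\cdot)|_{p\text{-var};[s,t]} \le V_p(f,[s,t]^2).$$

The same way one obtains

$$\sup_{v \in [s,t]} |f(\cdot,v)|_{p\text{-var};[s,t]} \le V_p(f,[s,t]^2).$$

Finally, for $u,v \in [s,t]$,

$$|f(u,v)| = \left| f\begin{pmatrix} s, u \\ s, v \end{pmatrix} \right| \le V_p(f,[s,t]^2)$$

and therefore $|f|_{\infty;[s,t]^2} \le V_p(f,[s,t]^2)$. Putting everything together, we end up with

$$\left| \Phi\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^q \le c_4\, V_p(f,[s,t]^2)^q\, |g|^q_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]}.$$

Hence for every partition $D, \tilde D \subset [s,t]$ one gets, using superadditivity of $|g|^q_{q\text{-var}}$,

$$\sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| \Phi\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^q \le c_4\, V_p(f,[s,t]^2)^q \sum_{t_i \in D,\, \tilde t_j \in \tilde D} |g|^q_{q\text{-var};[t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]} \le c_4\, V_p(f,[s,t]^2)^q\, |g|^q_{q\text{-var};[s,t]^2}.$$

Passing to the supremum over all partitions shows the assertion. □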



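This lemma allows us to define iterated 2D-integrals. Let $f, g_1, \ldots, g_n\colon [0,1]^2 \to \mathbb{R}$. An iterated 2D-integral is given by $\int_{\Delta^1_{s,t} \times \Delta^1_{s',t'}} f\, dg_1 = \int_{[s,t] \times [s',t']} f\, dg_1(u,v)$ for $n = 1$ and recursively defined by

$$\int_{\Delta^n_{s,t} \times \Delta^n_{s',t'}} f\, dg_1 \ldots dg_n := \int_{[s,t] \times [s',t']} \left( \int_{\Delta^{n-1}_{s,u} \times \Delta^{n-1}_{s',v}} f\, dg_1 \ldots dg_{n-1} \right) dg_n(u,v)$$

for $n \ge 2$.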
Proposition 4. Let $f, g_1, g_2, \ldots\colon [0,1]^2 \to \mathbb{R}$ and $p, q_1, q_2, \ldots$ be real numbers such that $p^{-1} + q_1^{-1} > 1$ and $q_i^{-1} + q_{i+1}^{-1} > 1$ for every $i \ge 1$. Assume that $f$ has finite $p$-variation and $g_i$ has finite $q_i$-variation for $i = 1, 2, \ldots$ and that for $(s,t) \in \Delta$ we have $f(s,\cdot) = f(\cdot,s) = 0$. Then for every $n \in \mathbb{N}$ there is a constant $C = C(p, q_1, \ldots, q_n)$ such that

$$\left| \int_{\Delta^n_{s,t} \times \Delta^n_{s,t}} f\, dg_1 \ldots dg_n \right| \le C\, V_p(f,[s,t]^2)\, V_{q_1}(g_1,[s,t]^2) \cdots V_{q_n}(g_n,[s,t]^2).$$

Proof. Define $\Phi^{(n)}(u,v) = \int_{\Delta^n_{s,u} \times \Delta^n_{s,v}} f\, dg_1 \ldots dg_n$. We will show a stronger result; namely that for every $n \in \mathbb{N}$ and $q'_n > q_n$ there is a constant $C = C(p, q_1, \ldots, q_n, q'_n)$ such that

$$V_{q'_n}(\Phi^{(n)}, [s,t]^2) \le C\, V_p(f,[s,t]^2)\, V_{q_1}(g_1,[s,t]^2) \cdots V_{q_n}(g_n,[s,t]^2).$$

To do so, let $\tilde q_1, \tilde q_2, \ldots$ be a sequence of real numbers such that $\tilde q_j > q_j$ and $\frac{1}{\tilde q_{j-1}} + \frac{1}{\tilde q_j} > 1$ for every $j = 1, 2, \ldots$ where we set $\tilde q_0 = p$. We make an induction over $n$. For $n = 1$, we have $\tilde q_1 > q_1$ and $\frac{1}{p} + \frac{1}{\tilde q_1} > 1$, hence from Theorem 2 we know that $g_1$ has finite controlled $\tilde q_1$-variation and Lemma 2 gives us

$$V_{\tilde q_1}(\Phi^{(1)}, [s,t]^2) \le c_1\, V_p(f;[s,t]^2)\, |g_1|_{\tilde q_1\text{-var};[s,t]^2} \le c_2\, V_p(f;[s,t]^2)\, V_{q_1}(g_1;[s,t]^2).$$

W.l.o.g. we may assume that $q'_1 > \tilde q_1 > q_1$, otherwise we choose $\tilde q_1$ smaller in the beginning. From $V_{q'_1}(\Phi^{(1)},[s,t]^2) \le V_{\tilde q_1}(\Phi^{(1)},[s,t]^2)$ the assertion follows for $n = 1$. Now take $n \in \mathbb{N}$. Note that

$$\Phi^{(n)}(u,v) = \int_{[s,u] \times [s,v]} \Phi^{(n-1)}\, dg_n$$

and clearly $\Phi^{(n-1)}(s,\cdot) = \Phi^{(n-1)}(\cdot,s) = 0$. We can use Lemma 2 again to see that

$$V_{\tilde q_n}(\Phi^{(n)}, [s,t]^2) \le c_3\, V_{\tilde q_{n-1}}(\Phi^{(n-1)};[s,t]^2)\, |g_n|_{\tilde q_n\text{-var};[s,t]^2} \le c_4\, V_{\tilde q_{n-1}}(\Phi^{(n-1)};[s,t]^2)\, V_{q_n}(g_n;[s,t]^2).$$

Using our induction hypothesis shows the result for $\tilde q_n$. By choosing $\tilde q_n$ smaller in the beginning if necessary, we may assume that $q'_n > \tilde q_n$ and the assertion follows. □

5. The main estimates 

In the following section, $(X,Y) = (X^1, Y^1, \ldots, X^d, Y^d)$ will always denote a centred continuous Gaussian process where $(X^i,Y^i)$ and $(X^j,Y^j)$ are independent for $i \ne j$. We will also assume that the $p$-variation of $R_{(X,Y)}$ is finite for a $p < 2$ and controlled by a symmetric 2D-control $\omega$ (this in particular implies that the $p$-variation of $R_X$, $R_Y$ and $R_{X-Y}$ is controlled by $\omega$, see [51 Section 15.3.2]). Let $\gamma > p$ such that $\frac{1}{p} + \frac{1}{\gamma} > 1$. The aim of this section is to show that for every $n \in \mathbb{N}$ there are constants $C(n)$ such that¹

$$|\mathbf{X}^n_{s,t} - \mathbf{Y}^n_{s,t}|_{L^2((\mathbb{R}^d)^{\otimes n})} \le C(n)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2p}} \quad \text{for every } s < t \tag{5.1}$$

where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$ (see Definition 11 below for the exact definition of $V_\infty$). Equivalently, we might show (5.1) coordinate-wise, i.e. proving that the same estimate holds for $|\mathbf{X}^w - \mathbf{Y}^w|_{L^2}$ for every word $w$ formed by the alphabet $A = \{1, \ldots, d\}$. In some special cases, i.e. if a word $w$ has a very simple structure, we can do this directly using multidimensional Young integration. This is done in Subsection 5.1. Subsection 5.2 shows (5.1) for $n = 1, 2, 3, 4$ coordinate-wise, using the shuffle algebra structure for iterated integrals and multidimensional Young integration. In Subsection 5.3 we show (5.1) coordinate-free for all $n > 4$, using an induction argument very similar to the one Lyons used for proving the Extension Theorem (cf. [17]).

¹ We prefer to write it in this notation instead of writing $\omega([s,t]^2)^{\frac{1}{2\gamma} + \frac{n-1}{2p}}$ to emphasize the different roles of the two terms. The first term will play no particular role and just comes from interpolation, whereas the second one will be crucial when doing the induction step from lower to higher levels in Proposition 8.

We start with giving a 2-dimensional analogue for the one-dimensional interpolation inequality. 



Definition 11. If $f\colon [0,1]^2 \to B$ is a continuous function in a Banach space and $(s,t) \times (u,v) \in \Delta \times \Delta$ we set

$$V_\infty(f, [s,t] \times [u,v]) = \sup_{A \subset [s,t] \times [u,v]} |f(A)|.$$

Lemma 3. For $\gamma > p \ge 1$ we have the interpolation inequality

$$V_{\gamma\text{-var}}(f, [s,t] \times [u,v]) \le V_\infty(f, [s,t] \times [u,v])^{1-p/\gamma}\, V_{p\text{-var}}(f, [s,t] \times [u,v])^{p/\gamma}$$

for all $(s,t), (u,v) \in \Delta$.

Proof. Exactly as 1D-interpolation, see [U Proposition 5.5]. □
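The mechanism behind the interpolation inequality is the elementary bound $\sum_k |a_k|^\gamma \le (\max_k |a_k|)^{\gamma - p} \sum_k |a_k|^p$. The sketch below checks the resulting inequality on one fixed grid; it is a discrete analogue only, since the supremum here runs over grid cells rather than over all rectangles. The function names and the test function $\min(u,v) - uv$ are illustrative assumptions.

```python
def increments(f, grid):
    """Rectangular increments of f over all cells of the product grid."""
    cells = list(zip(grid[:-1], grid[1:]))
    return [f(b, d) - f(a, d) - f(b, c) + f(a, c) for a, b in cells for c, d in cells]


def interpolation_sides(incs, p, gamma):
    """Return both sides of the discrete interpolation inequality for gamma > p >= 1."""
    lhs = sum(abs(x) ** gamma for x in incs) ** (1.0 / gamma)
    sup = max(abs(x) for x in incs)
    var_p = sum(abs(x) ** p for x in incs) ** (1.0 / p)
    rhs = sup ** (1.0 - p / gamma) * var_p ** (p / gamma)
    return lhs, rhs


if __name__ == "__main__":
    grid = [k / 20 for k in range(21)]
    incs = increments(lambda u, v: min(u, v) - u * v, grid)
    print(interpolation_sides(incs, p=1.2, gamma=1.8))
```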

5.1. Some special cases. If $Z\colon [0,1] \to \mathbb{R}$ is a process with smooth sample paths, we will use the notation

$$\mathbf{Z}^{(n)}_{s,t} = \int_{\Delta^n_{s,t}} dZ \ldots dZ$$

for $s < t$.

Lemma 4. Let $X\colon [0,1] \to \mathbb{R}$ be a centred Gaussian process with continuous paths of finite variation and assume that the $p$-variation of the covariance $R_X$ is controlled by a 2D-control $\omega$. For fixed $s < t$, define

$$f(u,v) = E\left(\mathbf{X}^{(n)}_{s,u} \mathbf{X}^{(n)}_{s,v}\right).$$

Then there is a constant $C = C(p,n)$ such that

$$V_p(f,[s,t]^2) \le C\, \omega([s,t]^2)^{\frac{n}{p}}.$$

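Proof. Let $t_i < t_{i+1}$, $\tilde t_j < \tilde t_{j+1}$. Then

$$f\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} = E\left( \left(\mathbf{X}^{(n)}_{s,t_{i+1}} - \mathbf{X}^{(n)}_{s,t_i}\right) \left(\mathbf{X}^{(n)}_{s,\tilde t_{j+1}} - \mathbf{X}^{(n)}_{s,\tilde t_j}\right) \right).$$

We know that $\mathbf{X}^{(n)}_{s,t} = \frac{(X_{s,t})^n}{n!}$. From the identity

$$b^n - a^n = (b-a)\left(a^{n-1} + a^{n-2} b + \ldots + a b^{n-2} + b^{n-1}\right)$$

we deduce that

$$f\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} = \frac{1}{(n!)^2} \sum_{k,l=0}^{n-1} E\left( X_{t_i,t_{i+1}} X_{\tilde t_j,\tilde t_{j+1}} (X_{s,t_{i+1}})^{n-1-k} (X_{s,t_i})^{k} (X_{s,\tilde t_{j+1}})^{n-1-l} (X_{s,\tilde t_j})^{l} \right).$$

We want to apply Wick's formula now (cf. HHI Theorem 1.28]). If $Z, \tilde Z \in \{X_{s,t_i}, X_{s,t_{i+1}}, X_{s,\tilde t_j}, X_{s,\tilde t_{j+1}}\}$ we know that

$$|E(X_{t_i,t_{i+1}} Z)|^p \le \omega([t_i,t_{i+1}] \times [s,t]),$$
$$|E(X_{t_i,t_{i+1}} X_{\tilde t_j,\tilde t_{j+1}})|^p \le \omega([t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]),$$
$$|E(Z \tilde Z)|^p \le \omega([s,t]^2)$$

and the same holds for $X_{\tilde t_j,\tilde t_{j+1}}$. Now take two partitions $D, \tilde D \subset [s,t]$. Then, by Wick's formula and the estimates above,

$$\sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| f\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^p \le c_1(p,n)\, \omega([s,t]^2)^{n-2} \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \omega([t_i,t_{i+1}] \times [s,t])\, \omega([\tilde t_j,\tilde t_{j+1}] \times [s,t])$$
$$+ c_2(p,n)\, \omega([s,t]^2)^{n-1} \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \omega([t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}]) \le c_3\, \omega([s,t]^2)^{n}.$$

□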
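Wick's formula, as used in the proof above, expresses the expectation of a product of centred jointly Gaussian variables as a sum over all pairings of products of covariances. A minimal sketch follows; the function names and the encoding of the covariance as a nested list are assumptions made for illustration.

```python
def pairings(indices):
    """Yield all ways to split the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_moment(cov, indices):
    """E[prod_k Z_{indices[k]}] for a centred Gaussian vector Z with covariance matrix `cov`."""
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for i, j in pairing:
            term *= cov[i][j]
        total += term
    return total


if __name__ == "__main__":
    # Fourth moment of a standard Gaussian: three pairings, each contributing 1.
    print(wick_moment([[1.0]], (0, 0, 0, 0)))
    # E[Z_0^2 Z_1^2] = cov_00 * cov_11 + 2 * cov_01^2.
    print(wick_moment([[2.0, 1.0], [1.0, 3.0]], (0, 0, 1, 1)))
```

With $2n$ factors there are $(2n-1)!!$ pairings, which is why the constants in the proof depend on $n$.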



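Lemma 5. Let $(X,Y)$ be a centred Gaussian process in $\mathbb{R}^2$ with continuous paths of finite variation. Assume that the $p$-variation of $R_{(X,Y)}$ is controlled by a 2D-control $\omega$ for $p < 2$ and take $\gamma > p$. Then for every $n \in \mathbb{N}$ there is a constant $C = C(n)$ such that

$$\left| \mathbf{X}^{(n)}_{s,t} - \mathbf{Y}^{(n)}_{s,t} \right|_{L^2} \le C(n)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2p}}$$

for any $s < t$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.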

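Proof. By induction. For $n = 1$ we simply have from Lemma 3

$$|X_{s,t} - Y_{s,t}|^2_{L^2} = E\left[(X_{s,t} - Y_{s,t})(X_{s,t} - Y_{s,t})\right] \le V_{\gamma\text{-var}}(R_{X-Y}, [s,t]^2) \le \varepsilon^2\, V_{p\text{-var}}(R_{X-Y}, [s,t]^2)^{p/\gamma} \le \varepsilon^2\, \omega([s,t]^2)^{1/\gamma}.$$

For $n \in \mathbb{N}$ we use the identity

$$\mathbf{X}^{(n)} - \mathbf{Y}^{(n)} = \frac{1}{n} \left\{ \mathbf{X}^{(n-1)} (X - Y) + \left(\mathbf{X}^{(n-1)} - \mathbf{Y}^{(n-1)}\right) Y \right\}$$

and hence

$$\left| \mathbf{X}^{(n)}_{s,t} - \mathbf{Y}^{(n)}_{s,t} \right|_{L^2} \le c_1 \left( \left| \mathbf{X}^{(n-1)}_{s,t} \right|_{L^2} |X_{s,t} - Y_{s,t}|_{L^2} + \left| \mathbf{X}^{(n-1)}_{s,t} - \mathbf{Y}^{(n-1)}_{s,t} \right|_{L^2} |Y_{s,t}|_{L^2} \right) \le c_2\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2p}}.$$

□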
Assume that $(Z^1, Z^2)$ is a centred, continuous Gaussian process in $\mathbb{R}^2$ with smooth sample paths and that both components are independent. Then (at least formally, cf. [5]),

$$\left| \int_0^1 Z^1_{0,u}\, dZ^2_u \right|^2_{L^2} = E\left[ \int_{[0,1]^2} Z^1_{0,u} Z^1_{0,v}\, dZ^2_u\, dZ^2_v \right] \tag{5.2}$$
$$= \int_{[0,1]^2} E\left[Z^1_{0,u} Z^1_{0,v}\right] dE\left[Z^2_u Z^2_v\right] = \int_{[0,1]^2} R_{Z^1}\begin{pmatrix} 0, u \\ 0, v \end{pmatrix} dR_{Z^2}(u,v) \tag{5.3}$$

where the integrals in the second row are 2D Young-integrals (to make this rigorous, one uses that the integrals are a.s. limits of Riemann sums and that a.s. convergence implies convergence in $L^1$ in the (inhomogeneous) Wiener chaos). These kinds of computations together with our estimates for 2D Young-integrals will be heavily used from now on.
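A Riemann-sum sketch of such a 2D integral of one covariance against another is given below. It is evaluated, purely as a formal illustration, for the Brownian covariance $R(u,v) = \min(u,v)$ in both slots; Brownian paths are not smooth, so this is the limiting case and not the setting of the computation above. For two independent Brownian motions the Itô isometry gives $E|\int_0^1 B^1\, dB^2|^2 = \int_0^1 u\, du = 1/2$, which the sum approaches. The function names are hypothetical.

```python
def rect_increment(g, a, b, c, d):
    """Rectangular increment of g over [a, b] x [c, d]."""
    return g(b, d) - g(a, d) - g(b, c) + g(a, c)


def young_2d_riemann(f, g, n):
    """Left-point Riemann sum for the 2D integral of f against g over [0, 1]^2 on a uniform n x n grid."""
    grid = [k / n for k in range(n + 1)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            cell = rect_increment(g, grid[i], grid[i + 1], grid[j], grid[j + 1])
            total += f(grid[i], grid[j]) * cell
    return total


if __name__ == "__main__":
    # Only diagonal cells carry mass for g = min, so the sum equals (n - 1) / (2 n).
    print(young_2d_riemann(min, min, 200))
```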

Lemma 6. Let $(X,Y) = (X^1, Y^1, \ldots, X^d, Y^d)$ be a centred Gaussian process with continuous paths of finite variation where $(X^i,Y^i)$ and $(X^j,Y^j)$ are independent for $i \ne j$. Assume that the $p$-variation of $R_{(X,Y)}$ is controlled by a 2D-control $\omega$ for $p < 2$. Let $w$ be a word of the form $w = i_1 \cdots i_n$ where $i_1, \ldots, i_n \in \{1, \ldots, d\}$ are all distinct. Take $\gamma > p$ such that $\frac{1}{p} + \frac{1}{\gamma} > 1$. Then there is a constant $C = C(p,\gamma,n)$ such that

$$|\mathbf{X}^w_{s,t} - \mathbf{Y}^w_{s,t}|_{L^2} \le C(n)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2p}}$$

for any $s < t$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.
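Proof. By the triangle inequality,

$$|\mathbf{X}^w_{s,t} - \mathbf{Y}^w_{s,t}|_{L^2} = \left| \int_{\Delta^n_{s,t}} dX^{i_1} \ldots dX^{i_n} - \int_{\Delta^n_{s,t}} dY^{i_1} \ldots dY^{i_n} \right|_{L^2} \le \sum_{k=1}^{n} \left| \int_{\Delta^n_{s,t}} dY^{i_1} \ldots dY^{i_{k-1}}\, d(X^{i_k} - Y^{i_k})\, dX^{i_{k+1}} \ldots dX^{i_n} \right|_{L^2}.$$

From independence, Proposition 4 and Lemma 3,

$$\left| \int_{\Delta^n_{s,t}} dY^{i_1} \ldots dY^{i_{k-1}}\, d(X^{i_k} - Y^{i_k})\, dX^{i_{k+1}} \ldots dX^{i_n} \right|^2_{L^2} \le \left| \int_{\Delta^n_{s,t} \times \Delta^n_{s,t}} dR_{Y^{i_1}} \ldots dR_{Y^{i_{k-1}}}\, dR_{X^{i_k} - Y^{i_k}}\, dR_{X^{i_{k+1}}} \ldots dR_{X^{i_n}} \right|$$
$$\le c_1\, V_p(R_{Y^{i_1}}, [s,t]^2) \cdots V_p(R_{Y^{i_{k-1}}}, [s,t]^2)\, V_\gamma(R_{X^{i_k} - Y^{i_k}}, [s,t]^2)\, V_p(R_{X^{i_{k+1}}}, [s,t]^2) \cdots V_p(R_{X^{i_n}}, [s,t]^2)$$
$$\le c_1\, V_\gamma(R_{X-Y}, [s,t]^2)\, \omega([s,t]^2)^{\frac{n-1}{p}} \le c_1\, \varepsilon^2\, \omega([s,t]^2)^{\frac{1}{\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{p}}.$$

The first inequality above is an immediate generalization of the calculations made in (5.2) and (5.3). Note that the respective random terms are not only pairwise but mutually independent here since we are dealing with a Gaussian process $(X,Y)$. Interchanging the limits is allowed since convergence in probability implies convergence in $L^p$, any $p > 0$, in the Wiener chaos. □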






5.2. Lower levels. 

5.2.1. n = 1,2. 

Proposition 5. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there are constants $C(1), C(2)$ which depend on $p$ and $\gamma$ such that

$$|\mathbf{X}^n_{s,t} - \mathbf{Y}^n_{s,t}|_{L^2} \le C(n)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2p}}$$

holds for $n = 1, 2$ and every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. The coordinate-wise estimates are just special cases of Lemma 5 and Lemma 6. □

5.2.2. n = 3. 

Proposition 6. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C(3)$ which depends on $p$ and $\gamma$ such that

$$|\mathbf{X}^3_{s,t} - \mathbf{Y}^3_{s,t}|_{L^2} \le C(3)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{1}{p}}$$

holds for every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. We have to show the estimate for $\mathbf{X}^{i,j,k} - \mathbf{Y}^{i,j,k}$ where $i,j,k \in \{1,\ldots,d\}$. From Propositions 3 and 2 it follows that it is enough to show the estimate for $\mathbf{X}^w - \mathbf{Y}^w$ where

$$w \in \{iii,\ ijk,\ iij : i,j,k \in \{1,\ldots,d\} \text{ distinct}\}.$$

The cases $w = iii$ and $w = ijk$ are special cases of Lemma 5 and Lemma 6. The rest of this section is devoted to showing the estimate for $w = iij$. □

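Lemma 7. Let $(X,Y)\colon [0,1] \to \mathbb{R}^2$ be a centred Gaussian process and consider

$$f(u,v) = E((X_u - Y_u) X_v).$$

Assume that the $p$-variation of $R_{(X,Y)}$ is controlled by a 2D-control $\omega$ where $p \ge 1$. Let $s < t$ and consider a rectangle $[\sigma,\tau] \times [\sigma',\tau'] \subset [s,t]^2$. Let $\gamma > p$. Then

$$V_{\gamma\text{-var}}(f, [\sigma,\tau] \times [\sigma',\tau']) \le \varepsilon\, \omega([s,t]^2)^{\frac{1}{2}\left(\frac{1}{p} - \frac{1}{\gamma}\right)}\, \omega([\sigma,\tau] \times [\sigma',\tau'])^{\frac{1}{\gamma}}$$

where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.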



Proof. Let $u < v$ and $u' < v' \in [s,t]$. Then

$$|E((X_{u,v} - Y_{u,v}) X_{u',v'})| \le |X_{u,v} - Y_{u,v}|_{L^2}\, |X_{u',v'}|_{L^2} \le V_\infty(R_{X-Y}, [s,t]^2)^{1/2}\, V_{p\text{-var}}(R_{(X,Y)}, [s,t]^2)^{1/2}$$

and hence

$$\sup_{u < v,\, u' < v'} |E((X_{u,v} - Y_{u,v}) X_{u',v'})| \le V_\infty(R_{X-Y}, [s,t]^2)^{1/2}\, \omega([s,t]^2)^{\frac{1}{2p}}.$$

Now take a partition $D$ of $[\sigma,\tau]$ and a partition $\tilde D$ of $[\sigma',\tau']$. Then

$$\sum_{t_i \in D,\, \tilde t_j \in \tilde D} |E((X_{t_i,t_{i+1}} - Y_{t_i,t_{i+1}}) X_{\tilde t_j,\tilde t_{j+1}})|^\gamma \le \sup_{u < v,\, u' < v'} |E((X_{u,v} - Y_{u,v}) X_{u',v'})|^{\gamma - p} \sum_{t_i \in D,\, \tilde t_j \in \tilde D} |E((X_{t_i,t_{i+1}} - Y_{t_i,t_{i+1}}) X_{\tilde t_j,\tilde t_{j+1}})|^p$$
$$\le V_\infty(R_{X-Y}, [s,t]^2)^{\frac{1}{2}(\gamma - p)}\, \omega([s,t]^2)^{\frac{1}{2}(\gamma/p - 1)}\, \omega([\sigma,\tau] \times [\sigma',\tau'])$$

and taking the supremum over all partitions shows the result. □



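Lemma 8. Let $(X,Y)\colon [0,1] \to \mathbb{R}^2$ be a centred Gaussian process with continuous paths of finite variation. Assume that the $p$-variation of $R_{(X,Y)}$ is controlled by a 2D-control $\omega$ where $p \ge 1$. Consider the function

$$g(u,v) = E\left[ \left(\mathbf{X}^{(2)}_{s,u} - \mathbf{Y}^{(2)}_{s,u}\right) \left(\mathbf{X}^{(2)}_{s,v} - \mathbf{Y}^{(2)}_{s,v}\right) \right].$$

Then for every $\gamma > p$ there is a constant $C = C(p,\gamma)$ such that

$$V_{\gamma\text{-var}}(g, [s,t]^2) \le C\, \varepsilon^2\, \omega([s,t]^2)^{\frac{1}{\gamma} + \frac{1}{p}}$$

holds for every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.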
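Proof. Let $u < v$ and $u' < v'$. Then

$$g\begin{pmatrix} u, v \\ u', v' \end{pmatrix} = E\left[ \left( \left(\mathbf{X}^{(2)}_{s,v} - \mathbf{X}^{(2)}_{s,u}\right) - \left(\mathbf{Y}^{(2)}_{s,v} - \mathbf{Y}^{(2)}_{s,u}\right) \right) \left( \left(\mathbf{X}^{(2)}_{s,v'} - \mathbf{X}^{(2)}_{s,u'}\right) - \left(\mathbf{Y}^{(2)}_{s,v'} - \mathbf{Y}^{(2)}_{s,u'}\right) \right) \right]$$
$$= \frac{1}{4} E\left[ \left( \left(X^2_{s,v} - X^2_{s,u}\right) - \left(Y^2_{s,v} - Y^2_{s,u}\right) \right) \left( \left(X^2_{s,v'} - X^2_{s,u'}\right) - \left(Y^2_{s,v'} - Y^2_{s,u'}\right) \right) \right].$$

Now,

$$\left(X^2_{s,v} - X^2_{s,u}\right) - \left(Y^2_{s,v} - Y^2_{s,u}\right) = X_{u,v}(X_{s,u} + X_{s,v}) - Y_{u,v}(Y_{s,u} + Y_{s,v})$$
$$= X_{u,v}(X_{s,u} - Y_{s,u}) + (X_{u,v} - Y_{u,v}) Y_{s,u} + X_{u,v}(X_{s,v} - Y_{s,v}) + (X_{u,v} - Y_{u,v}) Y_{s,v}.$$

The same way one gets

$$\left(X^2_{s,v'} - X^2_{s,u'}\right) - \left(Y^2_{s,v'} - Y^2_{s,u'}\right) = X_{u',v'}(X_{s,u'} - Y_{s,u'}) + (X_{u',v'} - Y_{u',v'}) Y_{s,u'} + X_{u',v'}(X_{s,v'} - Y_{s,v'}) + (X_{u',v'} - Y_{u',v'}) Y_{s,v'}.$$

Now we expand the product of both sums and take expectation. For the first term we obtain, using the Wick formula and Lemma 7,

$$|E(X_{u,v}(X_{s,u} - Y_{s,u}) X_{u',v'}(X_{s,u'} - Y_{s,u'}))| \le |E(X_{u,v} X_{u',v'})\, E[(X_{s,u} - Y_{s,u})(X_{s,u'} - Y_{s,u'})]|$$
$$+ |E[X_{u,v}(X_{s,u'} - Y_{s,u'})]\, E[X_{u',v'}(X_{s,u} - Y_{s,u})]| + |E[X_{u',v'}(X_{s,u'} - Y_{s,u'})]\, E[X_{u,v}(X_{s,u} - Y_{s,u})]|$$
$$\le V_{p\text{-var}}(R_{(X,Y)}, [u,v] \times [u',v'])\, V_{\gamma\text{-var}}(R_{X-Y}, [s,t]^2) + 2\, V_{\gamma\text{-var}}(R_{(X,X-Y)}, [u,v] \times [s,t])\, V_{\gamma\text{-var}}(R_{(X,X-Y)}, [u',v'] \times [s,t])$$
$$\le \varepsilon^2\, \omega([u,v] \times [u',v'])^{1/p}\, \omega([s,t]^2)^{1/\gamma} + 2 \varepsilon^2\, \omega([s,t]^2)^{1/p - 1/\gamma}\, \omega([u,v] \times [s,t])^{1/\gamma}\, \omega([u',v'] \times [s,t])^{1/\gamma}.$$

Now take two partitions $D, \tilde D$ of $[s,t]$. With our calculations above,

$$\sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| E\left( X_{t_i,t_{i+1}} (X_{s,t_i} - Y_{s,t_i})\, X_{\tilde t_j,\tilde t_{j+1}} (X_{s,\tilde t_j} - Y_{s,\tilde t_j}) \right) \right|^\gamma \le c_1\, \varepsilon^{2\gamma}\, \omega([s,t]^2) \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \omega([t_i,t_{i+1}] \times [\tilde t_j,\tilde t_{j+1}])^{\gamma/p}$$
$$+ c_2\, \varepsilon^{2\gamma}\, \omega([s,t]^2)^{\gamma/p - 1} \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \omega([t_i,t_{i+1}] \times [s,t])\, \omega([\tilde t_j,\tilde t_{j+1}] \times [s,t])$$
$$\le c_3\, \varepsilon^{2\gamma} \left( \omega([s,t]^2)\, \omega([s,t]^2)^{\gamma/p} + \omega([s,t]^2)^{\gamma/p - 1}\, \omega([s,t]^2)^{2} \right).$$

The other terms are treated exactly the same way. Taking the supremum over all partitions shows the result. □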

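The next corollary completes the proof of Proposition 6.

Corollary 2. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C = C(p,\gamma)$ such that

$$\left| \mathbf{X}^{i,i,j}_{s,t} - \mathbf{Y}^{i,i,j}_{s,t} \right|_{L^2} \le C\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{1}{p}}$$

holds for every $(s,t) \in \Delta$ and $i \ne j$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. From the triangle inequality,

$$\left| \mathbf{X}^{i,i,j}_{s,t} - \mathbf{Y}^{i,i,j}_{s,t} \right|_{L^2} \le \left| \int_{[s,t]} \left(\mathbf{X}^{i,i}_{s,u} - \mathbf{Y}^{i,i}_{s,u}\right) dX^j_u \right|_{L^2} + \left| \int_{[s,t]} \mathbf{Y}^{i,i}_{s,u}\, d(X^j - Y^j)_u \right|_{L^2}.$$

For the first integral, we use independence to move the expectation inside the integral as seen in the proof of Lemma 6; then we use 2D Young integration and Lemma 8 to obtain the desired estimate. The second integral is estimated in the same way using Lemma 4. □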
5.2.3. n = 4.

Proposition 7. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C(4)$ which depends on $p$ and $\gamma$ such that

$$|\mathbf{X}^4_{s,t} - \mathbf{Y}^4_{s,t}|_{L^2} \le C(4)\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{3}{2p}}$$

for every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. From Propositions 3 and 2 one sees that it is enough to show the estimate for $\mathbf{X}^w - \mathbf{Y}^w$ where

$$w \in \{iiii,\ ijkl,\ iijj,\ iiij,\ iijk,\ jiik : i,j,k,l \in \{1,\ldots,d\} \text{ distinct}\}.$$

The cases $w = iiii$ and $w = ijkl$ are special cases of Lemma 5 and Lemma 6. Hence it remains to show the estimate for

$$w \in \{iijj,\ iiij,\ iijk,\ jiik : i,j,k \in \{1,\ldots,d\} \text{ pairwise distinct}\}.$$

This is the content of the remaining section. □

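Lemma 9. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C = C(p,\gamma)$ such that

$$\left| \mathbf{X}^{i,i,j,k}_{s,t} - \mathbf{Y}^{i,i,j,k}_{s,t} \right|_{L^2} \le C\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{3}{2p}}$$

for every $(s,t) \in \Delta$ where $i,j,k$ are distinct and $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. From the triangle inequality,

$$\left| \mathbf{X}^{i,i,j,k}_{s,t} - \mathbf{Y}^{i,i,j,k}_{s,t} \right|_{L^2} = \left| \int_{\{s<u<v<t\}} \mathbf{X}^{i,i}_{s,u}\, dX^j_u\, dX^k_v - \int_{\{s<u<v<t\}} \mathbf{Y}^{i,i}_{s,u}\, dY^j_u\, dY^k_v \right|_{L^2}$$
$$\le \left| \int_{\{s<u<v<t\}} \left(\mathbf{X}^{i,i}_{s,u} - \mathbf{Y}^{i,i}_{s,u}\right) dX^j_u\, dX^k_v \right|_{L^2} + \left| \int_{\{s<u<v<t\}} \mathbf{Y}^{i,i}_{s,u}\, d(X^j - Y^j)_u\, dX^k_v \right|_{L^2} + \left| \int_{\{s<u<v<t\}} \mathbf{Y}^{i,i}_{s,u}\, dY^j_u\, d(X^k - Y^k)_v \right|_{L^2}.$$

For the first integral, we use Proposition 4 and Lemma 8 to obtain

$$\left| \int_{\{s<u<v<t\}} \left(\mathbf{X}^{i,i}_{s,u} - \mathbf{Y}^{i,i}_{s,u}\right) dX^j_u\, dX^k_v \right|^2_{L^2} = \left| \int_{\Delta^2_{s,t} \times \Delta^2_{s,t}} E\left[ \left(\mathbf{X}^{i,i}_{s,\cdot} - \mathbf{Y}^{i,i}_{s,\cdot}\right) \left(\mathbf{X}^{i,i}_{s,\cdot} - \mathbf{Y}^{i,i}_{s,\cdot}\right) \right] dR_{X^j}\, dR_{X^k} \right| \le c_1\, \varepsilon^2\, \omega([s,t]^2)^{1/\gamma + 1/p}\, \omega([s,t]^2)^{2/p}.$$

For the other two integrals we also use Proposition 3 together with Lemma 4 to obtain the same estimate. □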
Lemma 10. Let $(X,Y)\colon [0,1] \to \mathbb{R}^2$ be a centred Gaussian process with continuous paths of finite variation. Assume that the $p$-variation of $R_{(X,Y)}$ is controlled by a 2D-control $\omega$ where $p \ge 1$. Consider the function

$$g(u,v) = E\left[ \left(\mathbf{X}^{(3)}_{s,u} - \mathbf{Y}^{(3)}_{s,u}\right) \left(\mathbf{X}^{(3)}_{s,v} - \mathbf{Y}^{(3)}_{s,v}\right) \right].$$

Then for every $\gamma > p$ there is a constant $C = C(p,\gamma)$ such that

$$V_{\gamma\text{-var}}(g, [s,t]^2) \le C\, \varepsilon^2\, \omega([s,t]^2)^{1/\gamma + 2/p}$$

holds for every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. Similar to the one of Lemma 8, applying again Wick's formula. □



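Corollary 3. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C = C(p,\gamma)$ such that

$$\left| \mathbf{X}^{i,i,i,j}_{s,t} - \mathbf{Y}^{i,i,i,j}_{s,t} \right|_{L^2} \le C\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{3}{2p}}$$

holds for every $(s,t) \in \Delta$ and $i \ne j$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

Proof. The triangle inequality gives

$$\left| \mathbf{X}^{i,i,i,j}_{s,t} - \mathbf{Y}^{i,i,i,j}_{s,t} \right|_{L^2} \le \left| \int_{[s,t]} \left(\mathbf{X}^{i,i,i}_{s,u} - \mathbf{Y}^{i,i,i}_{s,u}\right) dX^j_u \right|_{L^2} + \left| \int_{[s,t]} \mathbf{Y}^{i,i,i}_{s,u}\, d(X^j - Y^j)_u \right|_{L^2}.$$

For the first integral, we move the expectation inside the integral, use 2D Young integration and Lemma 10 to conclude the estimate. The second integral is estimated the same way applying Lemma 4. □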



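It remains to show the estimates for $\mathbf{X}^w - \mathbf{Y}^w$ where $w \in \{iijj, jiik\}$. We need to be a bit careful here for the following reason: It is clear that $\mathbf{X}^{i,i,j}_{0,1} = \int_{[0,1]} \mathbf{X}^{i,i}_{0,u}\, dX^j_u$. One might expect that also $\mathbf{X}^{j,i,i}_{0,1} = \int_{[0,1]} X^j_{0,u}\, d\mathbf{X}^{i,i}_{0,u}$ holds, but this is not true in general. Indeed, just take $f(u) = g(u) = u$. Then

$$\int_0^1 f(u)\, d\left( \int_0^u g(v)\, dg(v) \right) = \frac{1}{2} \int_0^1 u\, d(u^2) = \int_0^1 u^2\, du = \frac{1}{3}$$

but

$$\int_{\Delta^2_{0,1}} f(u)\, dg(u)\, dg(v) = \int_{\Delta^3_{0,1}} du_1\, du_2\, du_3 = \frac{1}{6}.$$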
On the other hand, if $g$ is smooth, we can use Fubini to see that

$$\int_{\Delta^2_{0,1}} f(u)\, dg(u)\, dg(v) = \int_{[0,1]^2} f(u)\, g'(u)\, g'(v)\, 1_{\{u<v\}}\, du\, dv$$
$$= \frac{1}{2} \int_{[0,1]^2} f(u)\, g'(u)\, g'(v)\, 1_{\{u<v\}}\, du\, dv + \frac{1}{2} \int_{[0,1]^2} f(v)\, g'(v)\, g'(u)\, 1_{\{v<u\}}\, du\, dv$$
$$= \frac{1}{2} \int_{[0,1]^2} \left( f(u)\, 1_{\{u<v\}} + f(v)\, 1_{\{v<u\}} \right) g'(u)\, g'(v)\, du\, dv$$
$$= \frac{1}{2} \int_{[0,1]^2} f(u \wedge v)\, g'(u)\, g'(v)\, du\, dv = \frac{1}{2} \int_{[0,1]^2} f(u \wedge v)\, d(g(u) g(v))$$

where the last integral is a 2D Young integral. Hence we have seen that an iterated 1D-integral can be transformed into a usual 2D-integral. We will use this trick for the remaining estimates.
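The identity and the counterexample can be checked numerically for $f(u) = g(u) = u$ with simple left-point Riemann sums. The three function names below are hypothetical and the discretisation is only a sketch: the first two sums approximate $1/6$, while the naive one-dimensional rewriting approximates $1/3$.

```python
def iterated_integral(f, g, n):
    """Riemann sum for the integral of f(u) dg(u) dg(v) over {0 < u < v < 1}."""
    grid = [k / n for k in range(n + 1)]
    dg = [g(grid[k + 1]) - g(grid[k]) for k in range(n)]
    total = 0.0
    for j in range(n):
        for i in range(j):  # strict ordering u < v
            total += f(grid[i]) * dg[i] * dg[j]
    return total


def young_2d_form(f, g, n):
    """Riemann sum for (1/2) * integral over [0,1]^2 of f(min(u, v)) d(g(u) g(v))."""
    grid = [k / n for k in range(n + 1)]
    dg = [g(grid[k + 1]) - g(grid[k]) for k in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            # The rectangular increment of (u, v) -> g(u) g(v) factorises as dg[i] * dg[j].
            total += f(min(grid[i], grid[j])) * dg[i] * dg[j]
    return 0.5 * total


def naive_form(f, g, n):
    """Riemann sum for the integral of f(u) d(g(u)^2 / 2), which is NOT the iterated integral."""
    grid = [k / n for k in range(n + 1)]
    return sum(f(grid[k]) * (g(grid[k + 1]) ** 2 - g(grid[k]) ** 2) / 2 for k in range(n))


if __name__ == "__main__":
    identity = lambda x: x
    print(iterated_integral(identity, identity, 200))
    print(young_2d_form(identity, identity, 200))
    print(naive_form(identity, identity, 200))
```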

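Lemma 11. Let $f\colon [0,1]^2 \to \mathbb{R}$ be a continuous function. Set

$$\tilde f(u_1,u_2,v_1,v_2) = f(u_1 \wedge u_2, v_1 \wedge v_2).$$

(1) Let $u_1 < \tilde u_1$, $u_2 < \tilde u_2$, $v_1 < \tilde v_1$, $v_2 < \tilde v_2$ be all in $[0,1]$. Then

$$\tilde f\begin{pmatrix} u_1, \tilde u_1 \\ u_2, \tilde u_2 \\ v_1, \tilde v_1 \\ v_2, \tilde v_2 \end{pmatrix} = f\begin{pmatrix} u, \tilde u \\ v, \tilde v \end{pmatrix}$$

where we set

$$[u,\tilde u] = \begin{cases} [u_1,\tilde u_1] \cap [u_2,\tilde u_2] & \text{if } [u_1,\tilde u_1] \cap [u_2,\tilde u_2] \ne \emptyset \\ [0,0] & \text{if } [u_1,\tilde u_1] \cap [u_2,\tilde u_2] = \emptyset \end{cases}$$

$$[v,\tilde v] = \begin{cases} [v_1,\tilde v_1] \cap [v_2,\tilde v_2] & \text{if } [v_1,\tilde v_1] \cap [v_2,\tilde v_2] \ne \emptyset \\ [0,0] & \text{if } [v_1,\tilde v_1] \cap [v_2,\tilde v_2] = \emptyset \end{cases}$$

(2) For $s < t$, $\sigma < \tau$ and $p \ge 1$ we have

$$V_p(f, [s,t] \times [\sigma,\tau]) = V_p(\tilde f, [s,t]^2 \times [\sigma,\tau]^2).$$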
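Proof. (1) By definition of the higher dimensional increments,

$$\tilde f\begin{pmatrix} u_1, \tilde u_1 \\ u_2, \tilde u_2 \\ v_1 \\ v_2 \end{pmatrix} = \tilde f\begin{pmatrix} \tilde u_1 \\ \tilde u_2 \\ v_1 \\ v_2 \end{pmatrix} - \tilde f\begin{pmatrix} u_1 \\ \tilde u_2 \\ v_1 \\ v_2 \end{pmatrix} - \tilde f\begin{pmatrix} \tilde u_1 \\ u_2 \\ v_1 \\ v_2 \end{pmatrix} + \tilde f\begin{pmatrix} u_1 \\ u_2 \\ v_1 \\ v_2 \end{pmatrix}$$
$$= f(\tilde u_1 \wedge \tilde u_2, v_1 \wedge v_2) - f(u_1 \wedge \tilde u_2, v_1 \wedge v_2) - f(\tilde u_1 \wedge u_2, v_1 \wedge v_2) + f(u_1 \wedge u_2, v_1 \wedge v_2).$$

By a case distinction, one sees that this is equal to $f(\tilde u, v_1 \wedge v_2) - f(u, v_1 \wedge v_2)$. One goes on with

$$\tilde f\begin{pmatrix} u_1, \tilde u_1 \\ u_2, \tilde u_2 \\ v_1, \tilde v_1 \\ v_2, \tilde v_2 \end{pmatrix} = h(\tilde v_1 \wedge \tilde v_2) - h(v_1 \wedge \tilde v_2) - h(\tilde v_1 \wedge v_2) + h(v_1 \wedge v_2) = h(\tilde v) - h(v)$$

where $h(\cdot) = f(\tilde u, \cdot) - f(u, \cdot)$. Hence

$$\tilde f\begin{pmatrix} u_1, \tilde u_1 \\ u_2, \tilde u_2 \\ v_1, \tilde v_1 \\ v_2, \tilde v_2 \end{pmatrix} = h(\tilde v) - h(v) = f(\tilde u, \tilde v) - f(\tilde u, v) - f(u, \tilde v) + f(u, v) = f\begin{pmatrix} u, \tilde u \\ v, \tilde v \end{pmatrix}.$$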



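(2) Let $D$ be a partition of $[s,t]$ and $\tilde D$ a partition of $[\sigma,\tau]$. Then by (1),

$$\sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| f\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^p = \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| \tilde f\begin{pmatrix} t_i, t_{i+1} \\ t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^p \le V_p(\tilde f, [s,t]^2 \times [\sigma,\tau]^2)^p,$$

hence $V_p(f, [s,t] \times [\sigma,\tau]) \le V_p(\tilde f, [s,t]^2 \times [\sigma,\tau]^2)$. Now let $D_1, D_2$ be partitions of $[s,t]$ and $\tilde D_1, \tilde D_2$ be partitions of $[\sigma,\tau]$. Set $D = D_1 \cup D_2$, $\tilde D = \tilde D_1 \cup \tilde D_2$. Then $D$ is a partition of $[s,t]$ and $\tilde D$ a partition of $[\sigma,\tau]$ (see Figure 1 below).

[Figure 1.]

By (1),

$$\sum_{t^1_{i_1} \in D_1,\, t^2_{i_2} \in D_2,\, \tilde t^1_{j_1} \in \tilde D_1,\, \tilde t^2_{j_2} \in \tilde D_2} \left| \tilde f\begin{pmatrix} t^1_{i_1}, t^1_{i_1+1} \\ t^2_{i_2}, t^2_{i_2+1} \\ \tilde t^1_{j_1}, \tilde t^1_{j_1+1} \\ \tilde t^2_{j_2}, \tilde t^2_{j_2+1} \end{pmatrix} \right|^p = \sum_{t_i \in D,\, \tilde t_j \in \tilde D} \left| f\begin{pmatrix} t_i, t_{i+1} \\ \tilde t_j, \tilde t_{j+1} \end{pmatrix} \right|^p \le \left( V_p(f, [s,t] \times [\sigma,\tau]) \right)^p$$

and we also get $V_p(\tilde f, [s,t]^2 \times [\sigma,\tau]^2) \le V_p(f, [s,t] \times [\sigma,\tau])$.

□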
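Part (1) of Lemma 11 can be checked numerically: the 4D increment of $(u_1,u_2,v_1,v_2) \mapsto f(u_1 \wedge u_2, v_1 \wedge v_2)$ equals the 2D increment of $f$ over the intersected intervals. The helper names and the test function in the sketch below are illustrative assumptions.

```python
from itertools import product


def increment_4d(F, intervals):
    """Alternating-sign sum of F over the 16 corners of a 4D rectangle.

    `intervals` is a list of four (lo, hi) pairs; each lower endpoint contributes a factor -1.
    """
    total = 0.0
    for corner in product((0, 1), repeat=4):
        point = [intervals[k][c] for k, c in enumerate(corner)]
        sign = (-1) ** (4 - sum(corner))
        total += sign * F(*point)
    return total


def increment_2d(f, a, b, c, d):
    """Rectangular increment of f over [a, b] x [c, d]."""
    return f(b, d) - f(a, d) - f(b, c) + f(a, c)


def intersect(i1, i2):
    """Intersection of two intervals, with [0, 0] standing in for the empty case."""
    lo, hi = max(i1[0], i2[0]), min(i1[1], i2[1])
    return (lo, hi) if lo <= hi else (0.0, 0.0)


if __name__ == "__main__":
    f = lambda x, y: x * y + x ** 2 * y
    f_tilde = lambda u1, u2, v1, v2: f(min(u1, u2), min(v1, v2))
    boxes = [(0.1, 0.5), (0.3, 0.9), (0.2, 0.6), (0.4, 0.7)]
    u, u_hi = intersect(boxes[0], boxes[1])
    v, v_hi = intersect(boxes[2], boxes[3])
    print(increment_4d(f_tilde, boxes), increment_2d(f, u, u_hi, v, v_hi))
```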



Lemma 12. Let $(X,Y)\colon [0,1] \to \mathbb{R}^2$ be a centred Gaussian process with continuous paths of finite variation and assume that $\omega$ is a symmetric control which controls the $p$-variation of $R_{(X,Y)}$ where $p \ge 1$. Take $(s,t) \in \Delta$, $\gamma > p$ and set $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.

(1) Set $f(u_1,u_2,v_1,v_2) = E[X_{u_1} X_{u_2} X_{v_1} X_{v_2}]$. Then there is a constant $C_1 = C_1(p)$ and a symmetric 4D grid-control $\tilde\omega_1$ which controls the $p$-variation of $f$ and

$$V_p(f, [s,t]^4) \le \tilde\omega_1([s,t]^4)^{1/p} = C_1\, \omega([s,t]^2)^{2/p}.$$

(2) Set $\tilde f(u_1,u_2,v_1,v_2) = E\left[\mathbf{X}^{(2)}_{s,u_1 \wedge u_2} \mathbf{X}^{(2)}_{s,v_1 \wedge v_2}\right]$. Then there is a constant $C_2 = C_2(p)$ such that

$$V_p(\tilde f, [s,t]^4) \le C_2\, \omega([s,t]^2)^{2/p}.$$

(3) Set

$$g(u_1,u_2,v_1,v_2) = E[(X_{u_1} X_{u_2} - Y_{u_1} Y_{u_2})(X_{v_1} X_{v_2} - Y_{v_1} Y_{v_2})].$$

Then there is a constant $C_3 = C_3(p,\gamma)$ and a symmetric 4D grid-control $\tilde\omega_2$ which controls the $\gamma$-variation of $g$ and

$$V_\gamma(g, [s,t]^4) \le \tilde\omega_2([s,t]^4)^{1/\gamma} = C_3\, \varepsilon^2\, \omega([s,t]^2)^{1/\gamma + 1/p}.$$

(4) Set

$$\tilde g(u_1,u_2,v_1,v_2) = E\left[ \left(\mathbf{X}^{(2)}_{s,u_1 \wedge u_2} - \mathbf{Y}^{(2)}_{s,u_1 \wedge u_2}\right) \left(\mathbf{X}^{(2)}_{s,v_1 \wedge v_2} - \mathbf{Y}^{(2)}_{s,v_1 \wedge v_2}\right) \right].$$

Then there is a constant $C_4 = C_4(p,\gamma)$ such that

$$V_\gamma(\tilde g, [s,t]^4) \le C_4\, \varepsilon^2\, \omega([s,t]^2)^{1/\gamma + 1/p}.$$
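Proof. (1) Let $u_1 < \tilde u_1$, $u_2 < \tilde u_2$, $v_1 < \tilde v_1$, $v_2 < \tilde v_2$. By the Wick-formula,

$$|E[X_{u_1,\tilde u_1} X_{u_2,\tilde u_2} X_{v_1,\tilde v_1} X_{v_2,\tilde v_2}]|^p \le 3^{p-1} |E[X_{u_1,\tilde u_1} X_{u_2,\tilde u_2}]\, E[X_{v_1,\tilde v_1} X_{v_2,\tilde v_2}]|^p + 3^{p-1} |E[X_{u_1,\tilde u_1} X_{v_1,\tilde v_1}]\, E[X_{u_2,\tilde u_2} X_{v_2,\tilde v_2}]|^p$$
$$+ 3^{p-1} |E[X_{u_1,\tilde u_1} X_{v_2,\tilde v_2}]\, E[X_{u_2,\tilde u_2} X_{v_1,\tilde v_1}]|^p$$
$$\le 3^{p-1}\, \omega([u_1,\tilde u_1] \times [u_2,\tilde u_2])\, \omega([v_1,\tilde v_1] \times [v_2,\tilde v_2]) + 3^{p-1}\, \omega([u_1,\tilde u_1] \times [v_1,\tilde v_1])\, \omega([u_2,\tilde u_2] \times [v_2,\tilde v_2])$$
$$+ 3^{p-1}\, \omega([u_1,\tilde u_1] \times [v_2,\tilde v_2])\, \omega([u_2,\tilde u_2] \times [v_1,\tilde v_1]) =: \tilde\omega_1([u_1,\tilde u_1] \times [u_2,\tilde u_2] \times [v_1,\tilde v_1] \times [v_2,\tilde v_2]).$$

It is easy to see that $\tilde\omega_1$ is a symmetric grid-control and that it fulfils the stated property.

(2) A direct consequence of Lemma 4 and Lemma 11.

(3) We have

$$X_{u_1} X_{u_2} - Y_{u_1} Y_{u_2} = (X - Y)_{u_1} X_{u_2} + Y_{u_1} (X - Y)_{u_2}.$$

Hence for $u_1 < \tilde u_1$, $u_2 < \tilde u_2$, $v_1 < \tilde v_1$, $v_2 < \tilde v_2$,

$$g\begin{pmatrix} u_1, \tilde u_1 \\ u_2, \tilde u_2 \\ v_1, \tilde v_1 \\ v_2, \tilde v_2 \end{pmatrix} = E\left[(X - Y)_{u_1,\tilde u_1} X_{u_2,\tilde u_2} (X - Y)_{v_1,\tilde v_1} X_{v_2,\tilde v_2}\right] + E\left[Y_{u_1,\tilde u_1} (X - Y)_{u_2,\tilde u_2} (X - Y)_{v_1,\tilde v_1} X_{v_2,\tilde v_2}\right]$$
$$+ E\left[Y_{u_1,\tilde u_1} (X - Y)_{u_2,\tilde u_2} Y_{v_1,\tilde v_1} (X - Y)_{v_2,\tilde v_2}\right] + E\left[(X - Y)_{u_1,\tilde u_1} X_{u_2,\tilde u_2} Y_{v_1,\tilde v_1} (X - Y)_{v_2,\tilde v_2}\right].$$

For the first term we have, using Lemma 7,

$$\left| E\left[(X - Y)_{u_1,\tilde u_1} X_{u_2,\tilde u_2} (X - Y)_{v_1,\tilde v_1} X_{v_2,\tilde v_2}\right] \right|^\gamma \le 3^{\gamma-1} \left| E\left[(X - Y)_{u_1,\tilde u_1} X_{u_2,\tilde u_2}\right] E\left[(X - Y)_{v_1,\tilde v_1} X_{v_2,\tilde v_2}\right] \right|^\gamma$$
$$+ 3^{\gamma-1} \left| E\left[(X - Y)_{u_1,\tilde u_1} (X - Y)_{v_1,\tilde v_1}\right] \right|^\gamma \left| E\left[X_{u_2,\tilde u_2} X_{v_2,\tilde v_2}\right] \right|^\gamma + 3^{\gamma-1} \left| E\left[(X - Y)_{u_1,\tilde u_1} X_{v_2,\tilde v_2}\right] \right|^\gamma \left| E\left[X_{u_2,\tilde u_2} (X - Y)_{v_1,\tilde v_1}\right] \right|^\gamma$$
$$\le 3^{\gamma-1} \varepsilon^{2\gamma}\, \omega([s,t]^2)^{\gamma/p - 1}\, \omega([u_1,\tilde u_1] \times [u_2,\tilde u_2])\, \omega([v_1,\tilde v_1] \times [v_2,\tilde v_2])$$
$$+ 3^{\gamma-1} \varepsilon^{2\gamma}\, \omega([u_1,\tilde u_1] \times [v_1,\tilde v_1])\, \omega([u_2,\tilde u_2] \times [v_2,\tilde v_2])^{\gamma/p}$$
$$+ 3^{\gamma-1} \varepsilon^{2\gamma}\, \omega([s,t]^2)^{\gamma/p - 1}\, \omega([u_1,\tilde u_1] \times [v_2,\tilde v_2])\, \omega([u_2,\tilde u_2] \times [v_1,\tilde v_1])$$
$$\le 3^{\gamma-1} \varepsilon^{2\gamma}\, \omega([s,t]^2)^{\gamma/p - 1} \big( \omega([u_1,\tilde u_1] \times [u_2,\tilde u_2])\, \omega([v_1,\tilde v_1] \times [v_2,\tilde v_2]) + \omega([u_1,\tilde u_1] \times [v_1,\tilde v_1])\, \omega([u_2,\tilde u_2] \times [v_2,\tilde v_2])$$
$$+ \omega([u_1,\tilde u_1] \times [v_2,\tilde v_2])\, \omega([u_2,\tilde u_2] \times [v_1,\tilde v_1]) \big) =: \tilde\omega_2([u_1,\tilde u_1] \times [u_2,\tilde u_2] \times [v_1,\tilde v_1] \times [v_2,\tilde v_2]).$$

$\tilde\omega_2$ is a symmetric grid-control and fulfils the stated property. The other terms are treated in the same way.

(4) Follows from Lemma 8 and Lemma 11.

□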

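Corollary 4. Let $(X,Y)$, $\omega$, $p$ and $\gamma$ as in Lemma^ Then there is a constant $C = C(p,\gamma)$ such that

$$\left| \mathbf{X}^{i,i,j,j}_{s,t} - \mathbf{Y}^{i,i,j,j}_{s,t} \right|_{L^2} \le C\, \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{3}{2p}}$$

holds for every $(s,t) \in \Delta$ and $i \ne j$ where $\varepsilon^2 = V_\infty(R_{X-Y}, [s,t]^2)^{1-p/\gamma}$.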
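Proof. As seen before, we can use Fubini to obtain

$$\mathbf{X}^{i,i,j,j}_{s,t} = \int_{\Delta^2_{s,t}} \mathbf{X}^{i,i}_{s,u_1}\, dX^j_{u_1}\, dX^j_{u_2} = \frac{1}{2} \int_{[s,t]^2} \mathbf{X}^{i,i}_{s,u_1 \wedge u_2}\, d\left(X^j_{u_1} X^j_{u_2}\right)$$

and hence

$$\left| \mathbf{X}^{i,i,j,j}_{s,t} - \mathbf{Y}^{i,i,j,j}_{s,t} \right|_{L^2} \le \frac{1}{2} \left| \int_{[s,t]^2} \left(\mathbf{X}^{i,i}_{s,u_1 \wedge u_2} - \mathbf{Y}^{i,i}_{s,u_1 \wedge u_2}\right) d\left(X^j_{u_1} X^j_{u_2}\right) \right|_{L^2} + \frac{1}{2} \left| \int_{[s,t]^2} \mathbf{Y}^{i,i}_{s,u_1 \wedge u_2}\, d\left(X^j_{u_1} X^j_{u_2} - Y^j_{u_1} Y^j_{u_2}\right) \right|_{L^2}.$$

We use a Young 4D-estimate and the estimates of Lemma 12 to see that

$$\left| \int_{[s,t]^2} \left(\mathbf{X}^{i,i}_{s,u_1 \wedge u_2} - \mathbf{Y}^{i,i}_{s,u_1 \wedge u_2}\right) d\left(X^j_{u_1} X^j_{u_2}\right) \right|^2_{L^2} = \left| \int_{[s,t]^4} E\left[ \left(\mathbf{X}^{i,i}_{s,u_1 \wedge u_2} - \mathbf{Y}^{i,i}_{s,u_1 \wedge u_2}\right) \left(\mathbf{X}^{i,i}_{s,v_1 \wedge v_2} - \mathbf{Y}^{i,i}_{s,v_1 \wedge v_2}\right) \right] dE\left[X^j_{u_1} X^j_{u_2} X^j_{v_1} X^j_{v_2}\right] \right|$$
$$\le c_1\, \varepsilon^2\, \omega([s,t]^2)^{1/\gamma + 1/p}\, \omega([s,t]^2)^{2/p}.$$

The second term is estimated in the same way using again Lemma 12.

□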



Lemma 13. Let $f\colon [0,1]^2 \to \mathbb{R}$ and $g\colon [0,1]^2 \times [0,1]^2 \to \mathbb{R}$ be continuous where $g$ is symmetric in the first and the last two variables. Let $(s,t) \in \Delta$ and assume that $f(s,\cdot) = f(\cdot,s) = 0$. Assume also that $f$ has finite $p$-variation and that the $q$-variation of $g$ is controlled by a symmetric 4D grid-control $\tilde\omega$ where $\frac{1}{p} + \frac{1}{q} > 1$. Define

$$\Psi(u,v) = \int_{[s,u]^2 \times [s,v]^2} f(u_1 \wedge u_2, v_1 \wedge v_2)\, dg(u_1,u_2;v_1,v_2).$$

Then there is a constant $C = C(p,q)$ such that

$$V_q(\Psi; [s,t]^2) \le C\, V_p(f; [s,t]^2)\, \tilde\omega([s,t]^4)^{1/q}.$$

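Proof. Set

$$\tilde f(u_1,u_2,v_1,v_2) = f(u_1 \wedge u_2, v_1 \wedge v_2).$$

Let $u < v$ and $u' < v'$. Note that

$$1_{[s,v]^2 \times [s,v']^2} - 1_{[s,u]^2 \times [s,v']^2} - 1_{[s,v]^2 \times [s,u']^2} + 1_{[s,u]^2 \times [s,u']^2} = 1_{([s,v]^2 \setminus [s,u]^2) \times [s,v']^2} - 1_{([s,v]^2 \setminus [s,u]^2) \times [s,u']^2} = 1_{([s,v]^2 \setminus [s,u]^2) \times ([s,v']^2 \setminus [s,u']^2)}.$$

If we take out the square $[s,u]^2$ of the larger square $[s,v]^2$, what is left is the union of three essentially disjoint rectangles. More precisely,

$$[s,v]^2 \setminus [s,u]^2 = [u,v]^2 \cup ([s,u] \times [u,v]) \cup ([u,v] \times [s,u]).$$

The same holds for $u'$ and $v'$. Hence,

$$\left([s,v]^2 \setminus [s,u]^2\right) \times \left([s,v']^2 \setminus [s,u']^2\right)$$
$$= \left([u,v]^2 \cup ([s,u] \times [u,v]) \cup ([u,v] \times [s,u])\right) \times \left([u',v']^2 \cup ([s,u'] \times [u',v']) \cup ([u',v'] \times [s,u'])\right)$$
$$= \left([u,v]^2 \times [u',v']^2\right) \cup \left([u,v]^2 \times [s,u'] \times [u',v']\right) \cup \left([u,v]^2 \times [u',v'] \times [s,u']\right)$$
$$\cup \left([s,u] \times [u,v] \times [u',v']^2\right) \cup \left([s,u] \times [u,v] \times [s,u'] \times [u',v']\right) \cup \left([s,u] \times [u,v] \times [u',v'] \times [s,u']\right)$$
$$\cup \left([u,v] \times [s,u] \times [u',v']^2\right) \cup \left([u,v] \times [s,u] \times [s,u'] \times [u',v']\right) \cup \left([u,v] \times [s,u] \times [u',v'] \times [s,u']\right)$$

and all these are unions of essentially disjoint sets. Using continuity and the symmetry of $\tilde f$ and $g$ we have then

$$\Psi\begin{pmatrix} u, v \\ u', v' \end{pmatrix} = \int_{([s,v]^2 \setminus [s,u]^2) \times ([s,v']^2 \setminus [s,u']^2)} \tilde f\, dg = \int_{[u,v]^2 \times [u',v']^2} \tilde f\, dg + 2 \int_{[u,v]^2 \times [s,u'] \times [u',v']} \tilde f\, dg$$
$$+ 2 \int_{[s,u] \times [u,v] \times [u',v']^2} \tilde f\, dg + 4 \int_{[s,u] \times [u,v] \times [s,u'] \times [u',v']} \tilde f\, dg.$$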



For the first integral we use Young 4D-estimates. Since $\tilde f(s,\cdot,\cdot,\cdot) = \ldots = \tilde f(\cdot,\cdot,\cdot,s) = 0$, we can proceed as in the proof of Lemma [2] and use Lemma 11 to see that

\begin{align*}
\Bigl| \int_{[u,v]^2 \times [u',v']^2} \tilde f\, dg \Bigr| &\le c_1 V_p(f,[s,t]^2)\, V_q(g,[u,v]^2 \times [u',v']^2) \\
&\le c_1 V_p(f,[s,t]^2)\, \tilde\omega([u,v]^2 \times [u',v']^2)^{1/q}.
\end{align*}



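For the second integral, we have

\begin{align*}
\int_{[u,v]^2 \times [s,u'] \times [u',v']} \tilde f\, dg &= \int_{[u,v]^2 \times [s,u'] \times [u',v']} f(u_1 \wedge u_2, v_1 \wedge v_2)\, dg(u_1,u_2;v_1,v_2) \\
&= \int_{[u,v]^2 \times [s,u']} f(u_1 \wedge u_2, v_1)\, d\bigl[g(u_1,u_2;v_1,v') - g(u_1,u_2;v_1,u')\bigr].
\end{align*}

We now use a Young 3D-estimate to see that

\begin{align*}
\Bigl| \int_{[u,v]^2 \times [s,u'] \times [u',v']} \tilde f\, dg \Bigr| &\le c_2 V_p(f(\cdot \wedge \cdot, \cdot),[s,t]^3) \\
&\quad \times V_q\bigl(g(\cdot,\cdot;\cdot,v') - g(\cdot,\cdot;\cdot,u'),[u,v]^2 \times [s,u']\bigr).
\end{align*}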
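As before, one can show that $V_p(f(\cdot \wedge \cdot, \cdot),[s,t]^3) = V_p(f,[s,t]^2)$. For $g$, we have

\begin{align*}
V_q\bigl(g(\cdot,\cdot;\cdot,v') - g(\cdot,\cdot;\cdot,u'),[u,v]^2 \times [s,u']\bigr) &\le V_q(g,[u,v]^2 \times [s,u'] \times [u',v']) \\
&\le \tilde\omega([u,v]^2 \times [s,t] \times [u',v'])^{1/q}.
\end{align*}

Hence

\[ \Bigl| \int_{[u,v]^2 \times [s,u'] \times [u',v']} \tilde f\, dg \Bigr| \le c_2 V_p(f,[s,t]^2)\, \tilde\omega([u,v]^2 \times [s,t] \times [u',v'])^{1/q}. \]

Similarly, using Young 3D and 2D estimates, we get

\[ \Bigl| \int_{[s,u] \times [u,v] \times [u',v']^2} \tilde f\, dg \Bigr| \le c_3 V_p(f,[s,t]^2)\, \tilde\omega([s,t] \times [u,v] \times [u',v']^2)^{1/q} \]

and

\[ \Bigl| \int_{[s,u] \times [u,v] \times [s,u'] \times [u',v']} \tilde f\, dg \Bigr| \le c_4 V_p(f,[s,t]^2)\, \tilde\omega([s,t] \times [u,v] \times [s,t] \times [u',v'])^{1/q}. \]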



Putting all together, using the symmetry of $\tilde\omega$, we have shown that

\[ \Bigl| \Psi\binom{u,v}{u',v'} \Bigr| \le c_5 V_p(f,[s,t]^2)\, \tilde\omega([u,v] \times [u',v'] \times [s,t]^2)^{1/q}. \]

Since $\omega_2([u,v] \times [u',v']) := \tilde\omega([u,v] \times [u',v'] \times [s,t]^2)$ is a 2D grid-control this shows the claim. □
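The set decomposition used in the proof above can be sanity-checked numerically. The sketch below is only an illustration and not part of the argument; the helper names (`square_pieces`, `product_pieces`, `check_decomposition`) and the sample values are assumptions made here. It checks on random points that the three rectangles tile $[s,v]^2 \setminus [s,u]^2$, and that the nine 4D boxes of the product have the expected total volume.

```python
import itertools
import random


def volume(box):
    """Volume of an axis-parallel box given as a tuple of (lo, hi) pairs."""
    vol = 1.0
    for lo, hi in box:
        vol *= hi - lo
    return vol


def in_box(point, box):
    """True if the point lies in the closed box."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))


def square_pieces(s, u, v):
    """The three rectangles [u,v]^2, [s,u]x[u,v], [u,v]x[s,u]."""
    return [((u, v), (u, v)), ((s, u), (u, v)), ((u, v), (s, u))]


def product_pieces(s, u, v, u2, v2):
    """The nine 4D boxes obtained by multiplying out both decompositions."""
    first = square_pieces(s, u, v)
    second = square_pieces(s, u2, v2)
    return [a + b for a, b in itertools.product(first, second)]


def check_decomposition(s, u, v, u2, v2, samples=20000, seed=0):
    """Monte Carlo membership check in 2D plus an exact volume check in 4D."""
    rng = random.Random(seed)
    big, small = ((s, v), (s, v)), ((s, u), (s, u))
    pieces = square_pieces(s, u, v)
    for _ in range(samples):
        pt = (rng.uniform(s, v), rng.uniform(s, v))
        in_difference = in_box(pt, big) and not in_box(pt, small)
        # A point of the difference set must lie in exactly one piece.
        if sum(in_box(pt, b) for b in pieces) != int(in_difference):
            return False
    expected = ((v - s) ** 2 - (u - s) ** 2) * ((v2 - s) ** 2 - (u2 - s) ** 2)
    total = sum(volume(b) for b in product_pieces(s, u, v, u2, v2))
    return abs(total - expected) < 1e-12
```

The random points almost surely avoid the shared boundaries, which is why "essentially disjoint" is enough for the count to be exactly one.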

We are now able to prove the remaining estimate. 

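Corollary 5. Let $(X,Y)$, $\omega$, $\rho$ and $\gamma$ be as in Lemma [?]. Then there is a constant $C = C(\rho,\gamma)$ such that

\[ \bigl| \mathbf{X}^{j,i,i,k}_{s,t} - \mathbf{Y}^{j,i,i,k}_{s,t} \bigr|_{L^2} \le C \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{3}{2\rho}} \]

holds for every $(s,t) \in \Delta$ and $i,j,k$ pairwise distinct where $\varepsilon^2 = V_\infty(R_{X-Y},[s,t]^2)^{1-\rho/\gamma}$.

Proof. From

\[ \int_{\Delta^2_{s,w}} X^j_{s,u_1}\, dX^i_{u_1}\, dX^i_{u_2} = \frac{1}{2} \int_{[s,w]^2} X^j_{s,u_1 \wedge u_2}\, d\bigl(X^i_{u_1} X^i_{u_2}\bigr) \]

we see that

\[ \mathbf{X}^{j,i,i,k}_{s,t} = \frac{1}{2} \int_s^t \int_{[s,w]^2} X^j_{s,u_1 \wedge u_2}\, d\bigl(X^i_{u_1} X^i_{u_2}\bigr)\, dX^k_w. \]

Hence

\[ \bigl| \mathbf{X}^{j,i,i,k}_{s,t} - \mathbf{Y}^{j,i,i,k}_{s,t} \bigr|_{L^2} \le \Bigl| \int_s^t \Psi_1(w)\, dX^k_w \Bigr|_{L^2} + \Bigl| \int_s^t \Psi_2(w)\, dX^k_w \Bigr|_{L^2} + \Bigl| \int_s^t \Psi_3(w)\, d(X^k - Y^k)_w \Bigr|_{L^2} \]

where

\begin{align*}
\Psi_1(w) &= \int_{[s,w]^2} \bigl(X^j_{s,u_1 \wedge u_2} - Y^j_{s,u_1 \wedge u_2}\bigr)\, d\bigl(X^i_{u_1} X^i_{u_2}\bigr), \\
\Psi_2(w) &= \int_{[s,w]^2} Y^j_{s,u_1 \wedge u_2}\, d\bigl(X^i_{u_1} X^i_{u_2} - Y^i_{u_1} Y^i_{u_2}\bigr), \\
\Psi_3(w) &= \int_{[s,w]^2} Y^j_{s,u_1 \wedge u_2}\, d\bigl(Y^i_{u_1} Y^i_{u_2}\bigr).
\end{align*}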



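We start with the first integral. From independence and Young 2D-estimates,

\begin{align*}
\Bigl| \int_s^t \Psi_1(w)\, dX^k_w \Bigr|^2_{L^2} &= \int_{[s,t]^2} E[\Psi_1(w_1) \Psi_1(w_2)]\, dE\bigl[X^k_{w_1} X^k_{w_2}\bigr] \\
&\le c_1 V_\rho\bigl(E[\Psi_1(\cdot) \Psi_1(\cdot)],[s,t]^2\bigr)\, V_\rho\bigl(R_{X^k},[s,t]^2\bigr).
\end{align*}

Now,

\[ E[\Psi_1(w_1) \Psi_1(w_2)] = \int_{[s,w_1]^2 \times [s,w_2]^2} E\bigl[(X^j_{s,u_1 \wedge u_2} - Y^j_{s,u_1 \wedge u_2})(X^j_{s,v_1 \wedge v_2} - Y^j_{s,v_1 \wedge v_2})\bigr]\, dE\bigl[X^i_{u_1} X^i_{u_2} X^i_{v_1} X^i_{v_2}\bigr]. \]

In Lemma 12 we have seen that the $\rho$-variation of $E[X^i_\cdot X^i_\cdot X^i_\cdot X^i_\cdot]$ is controlled by a symmetric grid-control $\tilde\omega_1$. Hence we can apply Lemma 13 to conclude that

\begin{align*}
V_\rho\bigl(E[\Psi_1(\cdot) \Psi_1(\cdot)],[s,t]^2\bigr) &\le c_2 V_\gamma\bigl(R_{X^j - Y^j};[s,t]^2\bigr)\, \tilde\omega_1([s,t]^4)^{1/\rho} \\
&\le c_3 \varepsilon^2 \omega([s,t]^2)^{1/\gamma}\, \omega([s,t]^2)^{2/\rho}.
\end{align*}

Clearly, $V_\rho(R_{X^k};[s,t]^2) \le \omega([s,t]^2)^{1/\rho}$ and therefore

\[ \Bigl| \int_s^t \Psi_1(w)\, dX^k_w \Bigr|^2_{L^2} \le c_4 \varepsilon^2 \omega([s,t]^2)^{1/\gamma}\, \omega([s,t]^2)^{3/\rho}. \]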



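Now we come to the second integral. From independence,

\begin{align*}
\Bigl| \int_s^t \Psi_2(w)\, dX^k_w \Bigr|^2_{L^2} &= \int_{[s,t]^2} E[\Psi_2(w_1) \Psi_2(w_2)]\, dE\bigl[X^k_{w_1} X^k_{w_2}\bigr] \\
&\le c_5 V_\gamma\bigl(E[\Psi_2(\cdot) \Psi_2(\cdot)],[s,t]^2\bigr)\, V_\rho\bigl(R_{X^k},[s,t]^2\bigr).
\end{align*}

Now

\begin{align*}
E[\Psi_2(w_1) \Psi_2(w_2)] &= \int_{[s,w_1]^2 \times [s,w_2]^2} E\bigl[Y^j_{s,u_1 \wedge u_2} Y^j_{s,v_1 \wedge v_2}\bigr]\, dE\bigl[(X^i_{u_1} X^i_{u_2} - Y^i_{u_1} Y^i_{u_2})(X^i_{v_1} X^i_{v_2} - Y^i_{v_1} Y^i_{v_2})\bigr] \\
&=: \int_{[s,w_1]^2 \times [s,w_2]^2} E\bigl[Y^j_{s,u_1 \wedge u_2} Y^j_{s,v_1 \wedge v_2}\bigr]\, dg(u_1,u_2,v_1,v_2).
\end{align*}

In Lemma 12 we have seen that the 4D $\gamma$-variation of $g$ is controlled by a symmetric 4D grid-control $\tilde\omega_2$ where

\[ \tilde\omega_2([s,t]^4)^{1/\gamma} = c_6 \varepsilon^2 \omega([s,t]^2)^{1/\rho + 1/\gamma}. \]

Hence

\[ V_\gamma\bigl(E[\Psi_2(\cdot) \Psi_2(\cdot)],[s,t]^2\bigr) \le c_7 V_\rho\bigl(R_{Y^j};[s,t]^2\bigr)\, \tilde\omega_2([s,t]^4)^{1/\gamma} \le c_8 \varepsilon^2 \omega([s,t]^2)^{2/\rho + 1/\gamma}. \]

This gives us

\[ \Bigl| \int_s^t \Psi_2(w)\, dX^k_w \Bigr|^2_{L^2} \le c_9 \varepsilon^2 \omega([s,t]^2)^{1/\gamma}\, \omega([s,t]^2)^{3/\rho}. \]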



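For the third integral we see again that

\begin{align*}
\Bigl| \int_s^t \Psi_3(w)\, d(X^k - Y^k)_w \Bigr|^2_{L^2} &= \int_{[s,t]^2} E[\Psi_3(w_1) \Psi_3(w_2)]\, dE\bigl[(X^k - Y^k)_{w_1} (X^k - Y^k)_{w_2}\bigr] \\
&\le c_{10} V_\rho\bigl(E[\Psi_3(\cdot) \Psi_3(\cdot)],[s,t]^2\bigr)\, V_\gamma\bigl(R_{X^k - Y^k},[s,t]^2\bigr).
\end{align*}

From

\[ E[\Psi_3(w_1) \Psi_3(w_2)] = \int_{[s,w_1]^2 \times [s,w_2]^2} E\bigl[Y^j_{s,u_1 \wedge u_2} Y^j_{s,v_1 \wedge v_2}\bigr]\, dE\bigl[Y^i_{u_1} Y^i_{u_2} Y^i_{v_1} Y^i_{v_2}\bigr] \]

we see that we can apply Lemma 13 to obtain

\[ V_\rho\bigl(E[\Psi_3(\cdot) \Psi_3(\cdot)],[s,t]^2\bigr) \le c_{11} V_\rho\bigl(R_{Y^j};[s,t]^2\bigr)\, \tilde\omega_1([s,t]^4)^{1/\rho} \le c_{11}\, \omega([s,t]^2)^{3/\rho}. \]

Clearly, $V_\gamma(R_{X^k - Y^k},[s,t]^2) \le \varepsilon^2 \omega([s,t]^2)^{1/\gamma}$ and hence

\[ \Bigl| \int_s^t \Psi_3(w)\, d(X^k - Y^k)_w \Bigr|^2_{L^2} \le c_{12} \varepsilon^2 \omega([s,t]^2)^{1/\gamma}\, \omega([s,t]^2)^{3/\rho} \]

which gives the claim. □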



Remark 2. Even though Propositions 5, 6 and 7 are only formulated for Gaussian processes with sample paths of finite variation, the estimate (5.1) is valid also for general Gaussian rough paths for n = 1, 2, 3, 4. Indeed, this follows from the fact that Gaussian rough paths are just defined as $L^2$ limits of smooth paths, cf. [S].

5.3. Higher levels. Once we have shown our desired estimates for the first four levels, we can use 
induction to obtain also the higher levels. This is done in the next proposition. 

Proposition 8. Let $X$ and $Y$ be Gaussian processes as in Theorem 1. Let $\rho$, $\gamma$ be fixed and $\omega$ be a control. Assume that there are constants $C = C(n)$ such that



u (s,ty 



2P, 



holds for $n = 1, \ldots, [2\rho]$ and constants $C = C(n)$ such that



|X™ t — Y"J < C (n) euj (s, t)^ ~ 
holds for $n = 1, \ldots, [2\rho] + 1$ and every $(s,t) \in \Delta$. Here, $\varepsilon > 0$ and $\beta$ is a positive constant such that

\[ \beta \ge 4\rho \Bigl( 1 + 2^{\frac{[2\rho]+1}{2\rho}} \Bigl( \zeta\Bigl( \frac{[2\rho]+1}{2\rho} \Bigr) - 1 \Bigr) \Bigr) \]

where $\zeta$ is just the usual Riemann zeta function. Then for every $n \in \mathbb{N}$ there is a constant $C = C(n)$ such that



■y*n \rn 



, i oj(s,t) 2 " 



holds for every $(s,t) \in \Delta$.

Proof. From Proposition 1 we know that for every $n \in \mathbb{N}$ there are constants $C(n)$ such that



X" 



* I L 2 5 I s,£ | JJ2. 



holds for all $s < t$. We will prove the assertion by induction over $n$. The induction basis is fulfilled by assumption. Suppose that the statement is true for $k = 1, \ldots, n$ where $n \ge [2\rho] + 1$. We will show the statement for $n + 1$. Let $D = \{s = t_0 < t_1 < \ldots < t_j = t\}$ be any partition of $[s,t]$. Set



(i,xL,..,x:, t ,o)er 
x s , tl ® . . . ® x t ._ 1>t 



and the same for $Y$. We know that $\lim_{|D| \to 0} \mathbf{X}^D_{s,t} = S_{n+1}(\mathbf{X})_{s,t}$ a.s. and the same holds for $Y$ (indeed, this is just the definition of the Lyons lift, cf. [17, Theorem 2.2.1]). By multiplicativity, $\pi_k(\mathbf{X}^D_{s,t}) = \mathbf{X}^k_{s,t}$ for $k \le n$. We will show that for any dissection $D$ we have



n n+1 (Xf t - Y® t )\ L 2 < C (n + 1) (s, t)^ 



ui(s,t) 2 > 



We use the notation $(\mathbf{X}^D)^k := \pi_k(\mathbf{X}^D)$. Assume that $j \ge 2$. Let $D'$ be the partition of $[s,t]$ obtained by removing a point $t_i$ of the dissection $D$ for which

\[ \omega(t_{i-1},t_{i+1}) \le \begin{cases} \dfrac{2\omega(s,t)}{j-1} & \text{for } j \ge 3 \\ \omega(s,t) & \text{for } j = 2 \end{cases} \]

holds (Lemma 2.2.1 in [17] shows that there is indeed such a point). By the triangle inequality,



(X D -Y D ) 



< 



L- 



(x D -x D ') 



- ( Y D - Y D 



L 2 



X D ' - Y D ' 



;s<) 
We estimate the first term on the right hand side. As seen in the proof of [17, Theorem 2.2.1],

, n+l 



YLiK-ut Set R' = Y'-X . Then 

V\"+ 1 ..nA^ 1 



rD 

s,t 



(x- t -Xg) W -(Yfc-Y 

n 

V x' , x? + , 1_l - fx' , + Rl t )( XT+ 1 - 1 + R"+/"' 

i=i 

n 

■ /v t i _i,t i xv t<,i i+ i "-ti-ify k,U 



l-l 
i+i ' 



1=1 



By the triangle inequality, equivalence of $L^q$-norms in the Wiener chaos, our moment estimate for $\mathbf{X}^k$ and $\mathbf{Y}^k$ and the induction hypothesis,



(x^ t — X^ — — Y® t 



,\ 1+1 



L- 



< Cl (n + l)J2\K-„ 



i=i 



L- 



K t„t i+ i 



L 2 



R 



ti-i,U 



1 h,ti+i 



L 2 



< c 2 (n + 1) 2_^e^(U,U+i) 2 ' 



1=1 



/. . xjL Cj(ti_i,ti) 2 " L0{ti,t i+ i) 2 » 

-euj (H-i, H) 



x J- W (ti_i,ti) 2p W 2p 

< 2c 2 ew(s,t) 2T 2_/ 







rc 2 ecj I 



1 ■ 

2p 



1=0 



(0 (^) ! 



< 4p C2ew ( S ,t)^ w( ^ 1 '^ +l)2P 



where we used the neo-classical inequality (cf. [13]) and superadditivity of the control function. Hence for $j \ge 3$,



(xf t — XfO 



n+l 



n+l 



< 4pc 2 euj(s,t) 2 -> ry-^ 



< 



2 \ 2p , i uj(s,t) 2 " 

Apc 2 £Lo(s,t) 2 ~> 



3-1 
For $j = 2$ we get



Y D -v-D' 
A s,t ~~ ^s,t 



n+1 



1 s,i 1 s.t 



but then £)' = {s, t} and therefore 
points we see that 



t 1 s.t 



n+1 

L 2 



, 1 UJ (s,t) 2 " 
= 0. Hence by successively dropping 



L 2 



v A s,i — * s.t) 



< 



L- 



3=3 



2 \ 2 " 



1 + E — I 4pc2ew 



i Lu{ s ,ty? 



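holds for all partitions $D$. Since $n \ge [2\rho] + 1$,

\[ 1 + \sum_{j=3}^{\infty} \Bigl( \frac{2}{j-1} \Bigr)^{\frac{n}{2\rho}} \le 1 + 2^{\frac{[2\rho]+1}{2\rho}} \Bigl( \zeta\Bigl( \frac{[2\rho]+1}{2\rho} \Bigr) - 1 \Bigr) \]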



and thus 



By the choice of /3, we get the uniform bound 



[gp] + 1 
2p 



, i w (s,i) 2p 
c 2 ew(s,i) 2T 



w (M) 2p 



L 2 



which holds for all partitions $D$. Noting that a.s. convergence implies convergence in $L^2$ in the Wiener chaos, we obtain our claim by sending $|D| \to 0$. □
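The role of the constant $\beta$ in Proposition 8 can be illustrated numerically. The following is a rough sketch under the assumption that the threshold reads $4\rho(1 + 2^{a}(\zeta(a) - 1))$ with $a = ([2\rho]+1)/(2\rho)$, and that the sum produced by successively dropping points is $1 + \sum_{j \ge 3} (2/(j-1))^{n/(2\rho)}$. The function names are hypothetical, and $\zeta$ is only bounded from above by a partial sum plus an integral tail estimate.

```python
import math


def zeta_upper(a, terms=100_000):
    """Upper bound for the Riemann zeta function at a > 1.

    Uses sum_{m<=M} m^{-a} plus the tail bound int_M^infty x^{-a} dx.
    """
    if a <= 1:
        raise ValueError("zeta(a) is only finite for a > 1")
    partial = sum(m ** -a for m in range(1, terms + 1))
    return partial + terms ** (1.0 - a) / (a - 1.0)


def beta_threshold(rho):
    """Assumed lower bound for beta: 4 rho (1 + 2^a (zeta(a) - 1))."""
    a = (math.floor(2 * rho) + 1) / (2 * rho)
    return 4 * rho * (1 + 2 ** a * (zeta_upper(a) - 1))


def dropping_sum(n, rho, max_points=5000):
    """Partial sum 1 + sum_{j=3}^{J} (2 / (j - 1))^{n / (2 rho)}."""
    a = n / (2 * rho)
    return 1 + sum((2 / (j - 1)) ** a for j in range(3, max_points + 1))
```

For instance, with $\rho = 1$ the exponent is $a = 3/2$ and `dropping_sum(3, 1.0)` stays below `beta_threshold(1.0) / 4`, which is the kind of uniform bound the induction step relies on.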

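Corollary 6. Let $(X,Y)$, $\omega$, $\rho$ and $\gamma$ be as in Lemma [?]. Then for all $n \in \mathbb{N}$ there are constants $C = C(\rho,\gamma,n)$ such that

\[ \bigl| \mathbf{X}^n_{s,t} - \mathbf{Y}^n_{s,t} \bigr|_{L^2} \le C \varepsilon\, \omega([s,t]^2)^{\frac{1}{2\gamma}}\, \omega([s,t]^2)^{\frac{n-1}{2\rho}} \]

holds for every $(s,t) \in \Delta$ where $\varepsilon^2 = V_\infty(R_{X-Y},[0,1]^2)^{1-\rho/\gamma}$.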



Proof. For $n = 1,2,3,4$ this is the content of Propositions 5, 6 and 7. By making the constants larger if necessary, we also get



< c (n) ew ([s, t] 



2\ 2-, 



with $\beta$ chosen as in Proposition 8. We have already seen that



X" t L 2 , Y"J < c(n) 
holds for constants $c(n)$ where $n = 1, 2, 3$. Since $\rho < 2$, we have $[2\rho] + 1 \le 4$. From Proposition 8 we can conclude that

|x;, t -Y J " |t | La < C («) e a,([ B ,t] 2 )^ fa,([B ' f]! ' ) * 

holds for every n G N and constants c (n). Setting C (n) = ./^"K, gives our claim. □ 

6. Main result 

Assume that $X$ is a Gaussian process as in Theorem 1 with paths of finite $p$-variation. Consider a sequence $(A_k)_{k \in \mathbb{N}}$ of continuous operators

\[ A_k \colon C^{p\text{-var}}([0,1],\mathbb{R}) \to C^{1\text{-var}}([0,1],\mathbb{R}). \]

If $x = (x^1, \ldots, x^d) \in C^{p\text{-var}}([0,1],\mathbb{R}^d)$, we will write $A_k(x) = (A_k(x^1), \ldots, A_k(x^d))$. Assume that $A_k$ fulfils the following conditions:

(1) $A_k(x) \to x$ in the $|\cdot|_\infty$-norm if $k \to \infty$ for every $x \in C^{p\text{-var}}([0,1],\mathbb{R}^d)$.

(2) If $R_X$ has finite controlled $\rho$-variation, then, for some $C = C(\rho)$,

\[ \sup_{k,l \in \mathbb{N}} \bigl| R_{(A_k(X),A_l(X))} \bigr|_{\rho\text{-var};[0,1]^2} \le C\, |R_X|_{\rho\text{-var};[0,1]^2}. \]

Our main result is the following: 

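Theorem 5. Let $X$ be a Gaussian process as in Theorem 1 for $\rho < 2$ and $K \ge V_\rho(R_X,[0,1]^2)$. Then there is an enhanced Gaussian process $\mathbf{X}$ with sample paths in $C^{0,p\text{-var}}([0,1],G^{[p]}(\mathbb{R}^d))$ w.r.t. $(A_k)_{k \in \mathbb{N}}$ where $p \in (2\rho,4)$, i.e.

\[ \bigl| \rho_{p\text{-var}}\bigl(S_{[p]}(A_k(X)),\mathbf{X}\bigr) \bigr|_{L^r} \to 0 \]

for $k \to \infty$ and every $r \ge 1$. Moreover, choose $\gamma$ such that $\gamma > \rho$ and $\frac{1}{\gamma} + \frac{1}{\rho} > 1$. Then for $q > 2\gamma$ and every $N \in \mathbb{N}$ there is a constant $C = C(q,\rho,\gamma,K,N)$ such that

\[ \bigl| \rho_{q\text{-var}}\bigl(S_N(A_k(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r} \le C r^{N/2} \sup_{0 \le t \le 1} |A_k(X)_t - X_t|^{1-\rho/\gamma}_{L^2} \]

holds for every $k \in \mathbb{N}$.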



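Proof. The first statement is a fundamental result about Gaussian rough paths, see [9, Theorem 15.33]. For the second, take $\delta > 0$ and set

\[ \gamma' = (1+\delta)\gamma \quad \text{and} \quad \rho' = (1+\delta)\rho. \]

By choosing $\delta$ smaller if necessary we can assume that $\frac{1}{\gamma'} + \frac{1}{\rho'} > 1$ and $q > 2\gamma'$. Set

\[ \omega_{k,l}(A) = \bigl| R_{(A_k(X),A_l(X))} \bigr|^{\rho'}_{\rho'\text{-var};A} \]

for a rectangle $A \subset [0,1]^2$ and

\[ \varepsilon_{k,l} = V_\infty\bigl(R_{(A_k(X)-A_l(X))},[0,1]^2\bigr)^{\frac{1}{2} - \frac{\rho'}{2\gamma'}} = V_\infty\bigl(R_{(A_k(X)-A_l(X))},[0,1]^2\bigr)^{\frac{1}{2} - \frac{\rho}{2\gamma}}. \]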
From Theorem [2] we know that $\omega_{k,l}$ is a 2D control function which controls the $\rho'$-variation of $R_{(A_k(X),A_l(X))}$. From Corollary 6 we can conclude that there is a constant $c_1$ such that

\[ \bigl| \pi_n\bigl(S_N(A_k(X))_{s,t} - S_N(A_l(X))_{s,t}\bigr) \bigr|_{L^2} \le c_1 \varepsilon_{k,l}\, \omega_{k,l}([s,t]^2)^{\frac{1}{2\gamma'}}\, \omega_{k,l}([s,t]^2)^{\frac{n-1}{2\rho'}} \]

holds for every $n = 1, \ldots, N$, $(s,t) \in \Delta$ and $k,l \in \mathbb{N}$. Now,

\[ \omega_{k,l}([s,t]^2)^{\frac{1}{2\gamma'}}\, \omega_{k,l}([s,t]^2)^{\frac{n-1}{2\rho'}} \le \omega_{k,l}([s,t]^2)^{\frac{n}{2\gamma'}}\, \omega_{k,l}([0,1]^2)^{\frac{n-1}{2\rho'} - \frac{n-1}{2\gamma'}}. \]

From Theorem [2] and our assumptions on the $A_k$ we know that

\[ \omega_{k,l}([0,1]^2)^{1/\rho'} \le c_2 |R_X|_{\rho'\text{-var};[0,1]^2} \le c_3 V_\rho(R_X,[0,1]^2) \le c_4(\rho,\rho',K) \]

holds uniformly over all $k,l$. Hence

\[ \bigl| \pi_n\bigl(S_N(A_k(X))_{s,t} - S_N(A_l(X))_{s,t}\bigr) \bigr|_{L^2} \le c_5 \varepsilon_{k,l}\, \omega_{k,l}([s,t]^2)^{\frac{n}{2\gamma'}}. \]
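Proposition 1 shows with the same argument that

\[ \bigl| \pi_n\bigl(S_N(A_k(X))_{s,t}\bigr) \bigr|_{L^2} \le c_6\, \omega_{k,l}([s,t]^2)^{\frac{n}{2\gamma'}} \]

for every $k \in \mathbb{N}$ and the same holds for $S_N(A_l(X))_{s,t}$. From [9, Proposition 15.24] we can conclude that there is a constant $c_8$ such that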

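\[ \bigl| \rho_{q\text{-var}}\bigl(S_N(A_k(X)),S_N(A_l(X))\bigr) \bigr|_{L^r} \le c_8 r^{N/2} \varepsilon_{k,l} \]

holds for all $k,l \in \mathbb{N}$. In particular, we have shown that $(S_N(A_k(X)))_{k \in \mathbb{N}}$ is a Cauchy sequence in $L^r$ and it is clear that the limit is given by the Lyons lift $S_N(\mathbf{X})$ of the enhanced Gaussian process $\mathbf{X}$. Now fix $k \in \mathbb{N}$. For every $l \in \mathbb{N}$,

\begin{align*}
\bigl| \rho_{q\text{-var}}\bigl(S_N(A_k(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r} &\le \bigl| \rho_{q\text{-var}}\bigl(S_N(A_k(X)),S_N(A_l(X))\bigr) \bigr|_{L^r} + \bigl| \rho_{q\text{-var}}\bigl(S_N(A_l(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r} \\
&\le c_8 r^{N/2} \varepsilon_{k,l} + \bigl| \rho_{q\text{-var}}\bigl(S_N(A_l(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r}.
\end{align*}

It is easy to see that

\[ \varepsilon_{k,l} \to V_\infty\bigl(R_{(A_k(X)-X)};[0,1]^2\bigr)^{\frac{1}{2} - \frac{\rho}{2\gamma}} \quad \text{for } l \to \infty \]

and since

\[ \bigl| \rho_{q\text{-var}}\bigl(S_N(A_l(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r} \to 0 \quad \text{for } l \to \infty \]

we can conclude that

\[ \bigl| \rho_{q\text{-var}}\bigl(S_N(A_k(X)),S_N(\mathbf{X})\bigr) \bigr|_{L^r} \le c_8 r^{N/2}\, V_\infty\bigl(R_{(A_k(X)-X)},[0,1]^2\bigr)^{\frac{1}{2} - \frac{\rho}{2\gamma}} \]

holds for every $k \in \mathbb{N}$. Finally, we have for $[\sigma,\tau] \times [\sigma',\tau'] \subset [0,1]^2$

\[ \Bigl| R_{(A_k(X)-X)}\binom{\sigma,\tau}{\sigma',\tau'} \Bigr|_{\mathbb{R}^{d \times d}} \le 4 \sup_{0 \le s < t \le 1} \bigl| R_{(A_k(X)-X)}(s,t) \bigr|_{\mathbb{R}^{d \times d}} \]

and hence

\[ V_\infty\bigl(R_{(A_k(X)-X)},[0,1]^2\bigr) \le 4 \sup_{0 \le s < t \le 1} \bigl| R_{(A_k(X)-X)}(s,t) \bigr|_{\mathbb{R}^{d \times d}}. \]

Furthermore, for any $s < t$,

\[ \bigl| R_{(A_k(X)-X)}(s,t) \bigr|_{\mathbb{R}^{d \times d}} \le |A_k(X)_s - X_s|_{L^2(\mathbb{R}^d)}\, |A_k(X)_t - X_t|_{L^2(\mathbb{R}^d)} \le \sup_{0 \le t \le 1} |A_k(X)_t - X_t|^2_{L^2(\mathbb{R}^d)} \]

and therefore

\[ V_\infty\bigl(R_{(A_k(X)-X)},[0,1]^2\bigr)^{\frac{1}{2} - \frac{\rho}{2\gamma}} \le c_9 \sup_{0 \le t \le 1} |A_k(X)_t - X_t|^{1-\rho/\gamma}_{L^2(\mathbb{R}^d)} \]

which shows the result. □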



The next Theorem gives pathwise convergence rates for the Wong-Zakai error for suitable ap- 
proximations of the driving signal. 

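Theorem 6. Let $X$ be as in Theorem 1 for $\rho < 2$, $K \ge V_\rho(R_X,[0,1]^2)$ and $X^{(k)} = A_k(X)$. Consider the SDEs

\[ dY_t = V(Y_t)\, dX_t, \quad Y_0 \in \mathbb{R}^n \tag{6.1} \]

\[ dY^{(k)}_t = V(Y^{(k)}_t)\, dX^{(k)}_t, \quad Y^{(k)}_0 = Y_0 \in \mathbb{R}^n \tag{6.2} \]

where $|V|_{Lip^\theta} \le \nu < \infty$ for a $\theta > 2\rho$. Assume that there is a constant $C_1$ and a sequence $(\varepsilon_k)_{k \in \mathbb{N}} \subset \bigcup_{r \ge 1} l^r$ such that

\[ \sup_{0 \le t \le 1} \bigl| X^{(k)}_t - X_t \bigr|^2_{L^2} \le C_1 \varepsilon_k^{1/\rho} \quad \text{for all } k \in \mathbb{N}. \]

Choose $\eta$, $q$ such that

\[ 0 < \eta < \min\Bigl\{ \frac{1}{\rho} - \frac{1}{2},\ \frac{1}{2\rho} - \frac{1}{\theta} \Bigr\} \quad \text{and} \quad q \in \Bigl( \frac{2\rho}{1 - 2\rho\eta},\ \theta \Bigr). \]

Then both SDEs (6.1) and (6.2) have unique solutions $Y$ and $Y^{(k)}$ and there is a finite random variable $C$ and a null set $M$ such that

\[ \bigl| Y^{(k)}(\omega) - Y(\omega) \bigr|_{\infty;[0,1]} \le \bigl| Y^{(k)}(\omega) - Y(\omega) \bigr|_{q\text{-var};[0,1]} \le C(\omega)\, \varepsilon_k^\eta \tag{6.3} \]

holds for all $k \in \mathbb{N}$ and $\omega \in \Omega \setminus M$. The random variable $C$ depends on $\rho$, $q$, $\eta$, $\nu$, $K$, $C_1$, the sequence $(\varepsilon_k)_{k \in \mathbb{N}}$ and the driving process $X$ but not on the equation itself. The same holds for the set $M$.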
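The parameter constraints of Theorem 6 are easy to get wrong by hand, so a small helper may be useful. This is a sketch under the assumption that the constraints read $0 < \eta < \min\{1/\rho - 1/2,\ 1/(2\rho) - 1/\theta\}$ and $q \in (2\rho/(1 - 2\rho\eta),\ \theta)$; the function name `theorem6_parameters` is hypothetical.

```python
def theorem6_parameters(rho, theta, eta):
    """Return (eta_sup, (q_low, q_high)) for given rho, theta and eta.

    eta_sup is the supremum of admissible rates eta, and (q_low, q_high)
    is the open interval in which q may be chosen.
    """
    if not (1 <= rho < 2):
        raise ValueError("rho must lie in [1, 2)")
    if theta <= 2 * rho:
        raise ValueError("theta must exceed 2 * rho")
    eta_sup = min(1 / rho - 0.5, 1 / (2 * rho) - 1 / theta)
    if not (0 < eta < eta_sup):
        raise ValueError("eta must lie in (0, eta_sup)")
    q_low = 2 * rho / (1 - 2 * rho * eta)
    # eta < 1/(2 rho) - 1/theta guarantees q_low < theta.
    return eta_sup, (q_low, theta)
```

For example, `theorem6_parameters(1.0, 4.0, 0.2)` gives the rate bound 0.25 and the interval $(10/3, 4)$ for $q$; smoother vector fields (larger $\theta$) push the admissible rate towards $1/\rho - 1/2$.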

Remark 3. Note that this means that we have universal rates, i.e. the set $M$ and the random variable $C$ are valid for all starting points (and also vector fields subject to a uniform $Lip^\theta$-bound). In particular, our convergence rates apply to solutions viewed as $C^l$-diffeomorphisms where $l = [\theta - q]$, cf. [9, Theorem 11.12] and [7].

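Proof of Theorem 6. Note that $\gamma > \rho$ and $\frac{1}{\gamma} + \frac{1}{\rho} > 1$ is equivalent to $0 < \frac{1}{2\rho} - \frac{1}{2\gamma} < \frac{1}{\rho} - \frac{1}{2}$. Hence there is a $\gamma_0 > \rho$ such that $\eta = \frac{1}{2\rho} - \frac{1}{2\gamma_0}$ and $\frac{1}{\gamma_0} + \frac{1}{\rho} > 1$. Furthermore, $2\gamma_0 = \frac{2\rho}{1 - 2\rho\eta} < q$. Choose $\gamma_1 > \gamma_0$ such that still $2\gamma_1 < q$ and $\eta < \frac{1}{2\rho} - \frac{1}{2\gamma_1} < \frac{1}{\rho} - \frac{1}{2}$, hence $\frac{1}{\rho} + \frac{1}{\gamma_1} > 1$, hold. Set $\alpha := \frac{1}{2\rho} - \frac{1}{2\gamma_1} - \eta > 0$. From Theorem 5 we know that for every $r \ge 1$ and $N \in \mathbb{N}$ there is a constant $c_1$ such that

\[ \bigl| \rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr) \bigr|_{L^r} \le c_1 r^{N/2} \sup_{0 \le t \le 1} \bigl| X^{(k)}_t - X_t \bigr|^{1-\rho/\gamma_1}_{L^2} \le c_2 r^{N/2} \varepsilon_k^{\frac{1}{2\rho} - \frac{1}{2\gamma_1}} \]

holds for every $k \in \mathbb{N}$. Hence

\[ \Bigl| \frac{\rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr)}{\varepsilon_k^\eta} \Bigr|_{L^r} \le c_2 r^{N/2} \varepsilon_k^\alpha \]

for every $k \in \mathbb{N}$. From the Markov inequality, for any $\delta > 0$,

\[ \sum_{k=1}^{\infty} P\Bigl[ \frac{\rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr)}{\varepsilon_k^\eta} \ge \delta \Bigr] \le \frac{1}{\delta^r} \sum_{k=1}^{\infty} \Bigl| \frac{\rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr)}{\varepsilon_k^\eta} \Bigr|^r_{L^r}. \]

By assumption, we can choose $r$ large enough such that the series converges. With Borel-Cantelli we can conclude that

\[ \frac{\rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr)}{\varepsilon_k^\eta} \to 0 \]

outside a null set $M$ for $k \to \infty$. We set

\[ C_2 := \sup_{k \in \mathbb{N}} \frac{\rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr)}{\varepsilon_k^\eta} < \infty \quad \text{a.s.} \]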

Since $C_2$ is the supremum of $\mathcal{F}$-measurable random variables it is itself $\mathcal{F}$-measurable. Now set $N = [q]$ which turns $\rho_{q\text{-var}}$ into a rough path metric. Note that since $\theta > 2\rho$, (6.1) and (6.2) have indeed unique solutions $Y$ and $Y^{(k)}$. We substitute the driver $X$ by $S_N(\mathbf{X})$ resp. $X^{(k)}$ by $S_N(X^{(k)})$ in the above equations, now considered as RDEs in the $q$-rough paths space. Since $\theta > q$, both (RDE-) equations have again unique solutions and it is clear that they coincide with $Y$ and $Y^{(k)}$. From

Pq _ var (S N (XM), l) < p q _ var (s N (X^), S N (X))+ Pq _ var (S N (X) , 1) < Ci+p t _ var (S N (X) , 1) 

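we see that for every $\omega \in \Omega \setminus M$ the $S_N(X^{(k)}(\omega))$ are uniformly bounded for all $k$ in the topology given by the metric $\rho_{q\text{-var}}$. Thus we can apply local Lipschitz-continuity of the Itô-Lyons map (see [9, Theorem 10.26]) to see that there is a random variable $C_3$ such that

\[ \bigl| Y^{(k)} - Y \bigr|_{q\text{-var};[0,1]} \le C_3\, \rho_{q\text{-var}}\bigl(S_N(X^{(k)}),S_N(\mathbf{X})\bigr) \le C_3 \cdot C_2\, \varepsilon_k^\eta \]

holds for every $k \in \mathbb{N}$ outside $M$. Finally,

\[ \bigl| Y^{(k)}_t - Y_t \bigr| \le \bigl| Y^{(k)} - Y \bigr|_{q\text{-var};[0,t]} \le \bigl| Y^{(k)} - Y \bigr|_{q\text{-var};[0,1]} \]

is true for all $t \in [0,1]$ and the claim follows. □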
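6.1. Mollifier approximations. Let $\phi$ be a mollifier function with support $[-1,1]$, i.e. $\phi \in C_0^\infty([-1,1])$ is positive and $|\phi|_{L^1} = 1$. If $x\colon [0,1] \to \mathbb{R}$ is a continuous path, we denote by $\bar x\colon \mathbb{R} \to \mathbb{R}$ its continuous extension to the whole real line, i.e.

\[ \bar x_u = \begin{cases} x_0 & \text{for } u \in (-\infty,0] \\ x_u & \text{for } u \in [0,1] \\ x_1 & \text{for } u \in [1,\infty). \end{cases} \]

For $\varepsilon > 0$ set

\[ \phi_\varepsilon(u) := \frac{1}{\varepsilon} \phi(u/\varepsilon) \quad \text{and} \quad x^\varepsilon_t := \int_{\mathbb{R}} \phi_\varepsilon(t - u)\, \bar x_u\, du. \]

Let $(\varepsilon_k)_{k \in \mathbb{N}}$ be a sequence of real numbers such that $\varepsilon_k \to 0$ for $k \to \infty$. Define

\[ A_k(x) := x^{\varepsilon_k}. \]

In [9, Chapter 15.2.3] it is shown that the sequence $(A_k)_{k \in \mathbb{N}}$ fulfils the conditions of Theorem 5.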
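To make the construction concrete, here is a minimal numerical sketch of the mollifier approximation for a deterministic path. It is an illustration under assumptions made here: the bump function $\exp(-1/(1-u^2))$ is one possible choice of mollifier, the integral is replaced by a normalised Riemann sum, and the names `bump`, `extend` and `mollify` are hypothetical.

```python
import math


def bump(u):
    """Smooth bump supported in [-1, 1] (not yet normalised)."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1 else 0.0


def extend(path):
    """Constant extension of a path on [0, 1] to the whole real line."""
    return lambda u: path(min(max(u, 0.0), 1.0))


def mollify(path, eps, n_quad=401):
    """Return t -> x^eps_t, a discretised version of (phi_eps * x_bar)(t)."""
    x_bar = extend(path)
    shifts = [-eps + 2.0 * eps * i / (n_quad - 1) for i in range(n_quad)]
    weights = [bump(h / eps) for h in shifts]
    total = sum(weights)
    weights = [w / total for w in weights]  # weights sum to one

    def x_eps(t):
        # Substituting h = t - u turns the integral into an average of
        # x_bar over the window [t - eps, t + eps].
        return sum(w * x_bar(t - h) for w, h in zip(weights, shifts))

    return x_eps
```

Since `x_eps(t)` is a weighted average of values of the extended path within distance `eps` of `t`, a 1-Lipschitz path such as `math.sin` is approximated uniformly with error at most `eps`.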

Corollary 7. Let $X$ be as in Theorem 1 and assume that there is a constant $C$ such that $V_\rho(R_X;[s,t]^2) \le C |t - s|^{1/\rho}$ holds for all $s < t$. Choose $(\varepsilon_k)_{k \in \mathbb{N}} \in \bigcup_{r \ge 1} l^r$ and set $X^{(k)} = X^{\varepsilon_k}$. Then the solutions $Y^{(k)}$ of the SDE (6.2) converge pathwise to the solution $Y$ of (6.1) in the sense of (6.3) with rate $O(\varepsilon_k^\eta)$ where $\eta$ is chosen as in Theorem 6.

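Proof. It suffices to note that for every $\varepsilon > 0$, $Z \in \{X^1, \ldots, X^d\}$ and $t \in [0,1]$ we have

\begin{align*}
E\bigl[ |Z^\varepsilon_t - Z_t|^2 \bigr] &= E\Bigl[ \Bigl| \int_{\mathbb{R}} \phi_\varepsilon(t - u)\, (\bar Z_u - Z_t)\, du \Bigr|^2 \Bigr] \\
&= E\Bigl[ \Bigl| \int_{[t-\varepsilon,t+\varepsilon]} \phi_\varepsilon(t - u)\, (\bar Z_u - Z_t)\, du \Bigr|^2 \Bigr] \\
&= E\Bigl[ \int_{[t-\varepsilon,t+\varepsilon]^2} \phi_\varepsilon(t - u)\, \phi_\varepsilon(t - v)\, (\bar Z_u - Z_t)(\bar Z_v - Z_t)\, du\, dv \Bigr] \\
&= \int_{[t-\varepsilon,t+\varepsilon]^2} \phi_\varepsilon(t - u)\, \phi_\varepsilon(t - v)\, E\bigl[ (\bar Z_u - Z_t)(\bar Z_v - Z_t) \bigr]\, du\, dv \\
&\le \sup_{t \in [0,1],\ |h_1|,|h_2| \le \varepsilon} \bigl| E\bigl[ (\bar Z_{t+h_1} - Z_t)(\bar Z_{t+h_2} - Z_t) \bigr] \bigr| \\
&\le \sup_{t \in [0,1],\ |h| \le \varepsilon} E\bigl[ (\bar Z_{t+h} - Z_t)^2 \bigr] \le c_1 \varepsilon^{1/\rho}
\end{align*}

from which follows that $\sup_{0 \le t \le 1} |X^{\varepsilon_k}_t - X_t|^2_{L^2} \le c_1 \varepsilon_k^{1/\rho}$. We conclude with Theorem 6. □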
6.2. Piecewise linear approximations. If $D = \{0 = t_0 < t_1 < \ldots < t_{\#D-1} = 1\}$ is a partition of $[0,1]$ and $x\colon [0,1] \to \mathbb{R}$ a continuous path, we denote by $x^D$ the piecewise linear approximation of $x$ at the points of $D$, i.e. $x^D$ coincides with $x$ at the points $t_i$ and if $t_i \le t < t_{i+1}$ we have

\[ \frac{x^D_{t_{i+1}} - x^D_t}{t_{i+1} - t} = \frac{x_{t_{i+1}} - x_{t_i}}{t_{i+1} - t_i}. \]

Let $(D_k)_{k \in \mathbb{N}}$ be a sequence of partitions of $[0,1]$ such that $|D_k| := \max_{t_i \in D_k} \{|t_{i+1} - t_i|\} \to 0$ for $k \to \infty$. If $x\colon [0,1] \to \mathbb{R}$ is continuous, we define

\[ A_k(x) := x^{D_k}. \]
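The piecewise linear approximation is straightforward to implement. The sketch below is only illustrative (the names `piecewise_linear` and `mesh` are chosen here, and the path is a deterministic callable rather than a Gaussian sample path): it interpolates linearly between the values of the path at the partition points.

```python
from bisect import bisect_right


def mesh(partition):
    """Mesh size |D| = max |t_{i+1} - t_i| of a partition."""
    ts = sorted(partition)
    return max(b - a for a, b in zip(ts, ts[1:]))


def piecewise_linear(path, partition):
    """Return t -> x^D_t for a callable path and a partition D of [0, 1]."""
    ts = sorted(partition)
    xs = [path(t) for t in ts]

    def x_D(t):
        # Index i of the interval [t_i, t_{i+1}] containing t.
        i = min(max(bisect_right(ts, t) - 1, 0), len(ts) - 2)
        lam = (t - ts[i]) / (ts[i + 1] - ts[i])
        return (1.0 - lam) * xs[i] + lam * xs[i + 1]

    return x_D
```

For a twice differentiable path such as $t \mapsto t^2$ on a uniform partition with mesh $h$, the uniform error is at most $h^2/4$; for Gaussian sample paths the relevant error measure is instead the $L^2$ bound used in the proof of Corollary 8.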

In [9, Chapter 15.2.3] it is shown that $(A_k)_{k \in \mathbb{N}}$ fulfils the conditions of Theorem 5. If $R_X$ is the covariance of a Gaussian process, we set

\[ |D|_{R_X,\rho} = \Bigl( \max_{t_i \in D} V_\rho\bigl(R_X;[t_i,t_{i+1}]^2\bigr) \Bigr)^{\rho}. \]



Corollary 8. Let $X$ be as in Theorem 1. Choose a sequence of partitions $(D_k)_{k \in \mathbb{N}}$ of the interval $[0,1]$ such that $(|D_k|_{R_X,\rho})_{k \in \mathbb{N}} \in \bigcup_{r \ge 1} l^r$ and set $X^{(k)} = X^{D_k}$. Then the solutions $Y^{(k)}$ of the SDE (6.2) converge pathwise to the solution $Y$ of (6.1) in the sense of (6.3) with rate $O(\varepsilon_k^\eta)$ where $(\varepsilon_k)_{k \in \mathbb{N}} = (|D_k|_{R_X,\rho})_{k \in \mathbb{N}}$ and $\eta$ is chosen as in Theorem 6.



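Proof. Let $D$ be any partition of $[0,1]$ and $t \in [t_i,t_{i+1}]$ where $t_i, t_{i+1} \in D$. Take $Z \in \{X^1, \ldots, X^d\}$. Then

\[ Z^D_t - Z_t = Z_{t_i,t_{i+1}}\, \frac{t - t_i}{t_{i+1} - t_i} - Z_{t_i,t}. \]

Therefore

\[ |Z^D_t - Z_t|_{L^2} \le |Z_{t_i,t_{i+1}}|_{L^2} + |Z_{t_i,t}|_{L^2} \le 2 V_\rho\bigl(R_X;[t_i,t_{i+1}]^2\bigr)^{1/2} \le 2 |D|^{\frac{1}{2\rho}}_{R_X,\rho}. \]

We conclude with Theorem 6. □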

Example 1. Let $X = B^H$ be the fractional Brownian motion with Hurst parameter $H \in (1/4,1/2]$. Set $\rho = \frac{1}{2H} < 2$. Then one can show that $R_X$ has finite $\rho$-variation and $V_\rho(R_X;[s,t]^2) \le c(H) |t - s|^{1/\rho}$ for all $(s,t) \in \Delta$ (see [10], Example 1). Assume that the vector fields in (6.1) are sufficiently smooth by which we mean that $1/\rho - 1/2 \le 1/(2\rho) - 1/\theta$, i.e.

\[ \theta \ge \frac{2\rho}{\rho - 1} = \frac{1}{1/2 - H}. \]

Let $(D_k)_{k \in \mathbb{N}}$ be the sequence of uniform partitions. By Corollary 8, for every $\eta < 2H - 1/2$ there is a random variable $C$ such that

\[ \bigl| Y^{(k)} - Y \bigr|_{\infty;[0,1]} \le C \Bigl( \frac{1}{k} \Bigr)^{\eta} \quad \text{a.s.,} \]

hence we have a Wong-Zakai convergence rate arbitrarily close to $2H - 1/2$. In particular, for the Brownian motion, we obtain a rate close to $1/2$, see also [11] and [7]. For $H \to 1/4$, the convergence rate tends to $0$ which reflects the fact that the Lévy area indeed diverges for $H = 1/4$, see [3].
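The arithmetic of Example 1 can be tabulated with a few lines of code. This is merely a convenience sketch (the function name `fbm_wong_zakai` is made up here); it returns $\rho = 1/(2H)$, the limiting rate $2H - 1/2$ and the smoothness threshold $1/(1/2 - H)$ for the vector fields.

```python
def fbm_wong_zakai(H):
    """Return (rho, limiting_rate, theta_threshold) for Hurst parameter H."""
    if not (0.25 < H <= 0.5):
        raise ValueError("H must lie in (1/4, 1/2]")
    rho = 1.0 / (2.0 * H)
    limiting_rate = 1.0 / rho - 0.5  # equals 2H - 1/2
    # For Brownian motion (H = 1/2) the threshold 2 rho / (rho - 1) is infinite.
    theta_threshold = float("inf") if H == 0.5 else 1.0 / (0.5 - H)
    return rho, limiting_rate, theta_threshold
```

For instance, $H = 3/8$ gives $\rho = 4/3$, a limiting rate of $1/4$ and the requirement $\theta \ge 8$.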
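6.3. The simplified step-$N$ Euler scheme. Consider again the SDE

\[ dY_t = V(Y_t)\, dX_t, \quad Y_0 \in \mathbb{R}^n \]

interpreted as a pathwise RDE driven by the lift $\mathbf{X}$ of a Gaussian process $X$ which fulfils the conditions of Theorem 1. Let $D$ be a partition of $[0,1]$. We recall the simplified step-$N$ Euler scheme from the introduction:

\begin{align*}
Y^{\mathrm{sEuler}^N;D}_0 &= Y_0 \\
Y^{\mathrm{sEuler}^N;D}_{t_{j+1}} &= Y^{\mathrm{sEuler}^N;D}_{t_j} + V_i\bigl(Y^{\mathrm{sEuler}^N;D}_{t_j}\bigr) X^i_{t_j,t_{j+1}} + \frac{1}{2} V_{i_1} V_{i_2}\bigl(Y^{\mathrm{sEuler}^N;D}_{t_j}\bigr) X^{i_1}_{t_j,t_{j+1}} X^{i_2}_{t_j,t_{j+1}} \\
&\quad + \ldots + \frac{1}{N!} V_{i_1} \ldots V_{i_{N-1}} V_{i_N}\bigl(Y^{\mathrm{sEuler}^N;D}_{t_j}\bigr) X^{i_1}_{t_j,t_{j+1}} \cdots X^{i_N}_{t_j,t_{j+1}}
\end{align*}

where $t_j \in D$. In this section, we will investigate the convergence rate of this scheme. For simplicity, we will assume that

\[ V_\rho\bigl(R_X;[s,t]^2\bigr) = O\bigl(|t - s|^{1/\rho}\bigr) \]

which can always be achieved at the price of a deterministic time-change based on

\[ [0,1] \ni t \mapsto \frac{V_\rho\bigl(R_X;[0,t]^2\bigr)^{\rho}}{V_\rho\bigl(R_X;[0,1]^2\bigr)^{\rho}} \in [0,1]. \]

Set $D_k = \{ \frac{i}{k} : i = 0, \ldots, k \}$.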
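For orientation, the case $N = 2$ (the simplified Milstein-type scheme) can be sketched in a few lines. The code below is a simplified illustration under assumptions made here: the state is scalar ($n = 1$), the vector fields and their derivatives are passed as plain callables, and the function name `simplified_step2_euler` is hypothetical. The second-order term uses only products of increments of the driving path, so no iterated integrals (Lévy areas) are needed.

```python
def simplified_step2_euler(y0, fields, derivatives, increments):
    """Simplified step-2 Euler scheme for dY = sum_i V_i(Y) dX^i, scalar Y.

    fields[i] is V_i, derivatives[i] is V_i', and increments is a sequence
    of d-dimensional increments X_{t_j, t_{j+1}} of the driving path.
    Returns the list of approximations at all partition points.
    """
    y = float(y0)
    values = [y]
    for dx in increments:
        # First-order term: sum_i V_i(y) dx^i.
        first = sum(V(y) * h for V, h in zip(fields, dx))
        # Second-order term: (1/2) sum_{i1,i2} (V_{i1} V_{i2})(y) dx^{i1} dx^{i2},
        # where (V_{i1} V_{i2})(y) = V_{i2}'(y) V_{i1}(y) in the scalar case.
        second = 0.5 * first * sum(dV(y) * h for dV, h in zip(derivatives, dx))
        y = y + first + second
        values.append(y)
    return values
```

As a sanity check, for $V(y) = y$ driven by the smooth path $x_t = t$ with $k = 100$ uniform steps, the scheme multiplies by $1 + h + h^2/2$ in each step and reproduces $e$ up to an error of order $10^{-5}$.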

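Corollary 9. Let $p > 2\rho$ and assume that $|V|_{Lip^\theta} < \infty$ for $\theta > p$. Choose $\eta$ and $N$ such that

\[ \eta < \min\Bigl\{ \frac{1}{\rho} - \frac{1}{2},\ \frac{1}{2\rho} - \frac{1}{\theta} \Bigr\} \quad \text{and} \quad N \le [\theta]. \]

Then there are random variables $C_1$ and $C_2$ such that

\[ \max_{t_j \in D_k} \bigl| Y_{t_j} - Y^{\mathrm{sEuler}^N;D_k}_{t_j} \bigr| \le C_1 \Bigl( \frac{1}{k} \Bigr)^{\eta} + C_2 \Bigl( \frac{1}{k} \Bigr)^{\frac{N+1}{p} - 1} \quad \text{a.s. for all } k \in \mathbb{N}. \]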



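Proof. Recall the step-$N$ Euler scheme from the introduction (or cf. [9, Chapter 10]). Set $X^{(k)} = X^{D_k}$ and let $Y^{(k)}$ be the solution of the SDE (6.2). Then $Y^{\mathrm{sEuler}^N;D_k}_{t_j}$ coincides with the step-$N$ Euler approximation $Y^{\mathrm{Euler}^N;D_k}_{t_j}$ of $Y^{(k)}$ for every $t_j \in D_k$ and therefore, using the triangle inequality,

\[ \max_{t_j \in D_k} \bigl| Y_{t_j} - Y^{\mathrm{sEuler}^N;D_k}_{t_j} \bigr| \le \sup_{t \in [0,1]} \bigl| Y_t - Y^{(k)}_t \bigr| + \max_{t_j \in D_k} \bigl| Y^{(k)}_{t_j} - Y^{\mathrm{Euler}^N;D_k}_{t_j} \bigr|. \]

By the choice of $D_k$ we have $|D_k|_{R_X,\rho} = O(k^{-1})$. Applying Corollary 8 we obtain for the first term $|Y - Y^{(k)}|_\infty = O(k^{-\eta})$. Referring to [9, Theorem 10.30] we see that the second term is of order $O\bigl(k^{-(\frac{N+1}{p} - 1)}\bigr)$. □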
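The interplay of the two error terms can be explored numerically. The sketch below assumes that the two contributions are of order $k^{-(2/p - 1/2)}$ (Wong-Zakai error, with $p > 2\rho$ and sufficiently smooth vector fields) and $k^{-((N+1)/p - 1)}$ (step-$N$ Euler error); the function name `euler_rate_exponents` is hypothetical.

```python
def euler_rate_exponents(rho, N, p):
    """Return (wong_zakai, euler, overall) convergence exponents.

    Assumes theta is large enough that the Wong-Zakai exponent is not
    limited by the smoothness of the vector fields.
    """
    if not (1 <= rho < 2):
        raise ValueError("rho must lie in [1, 2)")
    if p <= 2 * rho:
        raise ValueError("p must exceed 2 * rho")
    wong_zakai = 2.0 / p - 0.5
    euler = (N + 1) / p - 1.0
    return wong_zakai, euler, min(wong_zakai, euler)
```

Taking $p$ close to $2\rho$, one sees that for $\rho = 1$ the step-2 scheme already attains an overall exponent close to $1/2$, while for $\rho \in (1,2)$ the step-2 Euler term is the bottleneck and step 3 is needed to reach $1/\rho - 1/2$; going beyond step 3 does not increase the minimum.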



Remark 4. Assume that the vector fields are sufficiently smooth, i.e. $\theta \ge \frac{2\rho}{\rho - 1}$. Then we obtain an error of $O\bigl(k^{-(2/p - 1/2)}\bigr) + O\bigl(k^{-(\frac{N+1}{p} - 1)}\bigr)$, any $p > 2\rho$. That means that in the case $\rho = 1$, the step-2 scheme (i.e. the simplified Milstein scheme) gives an optimal convergence rate of (almost) $1/2$. For $\rho \in (1,2)$, the step-3 scheme gives an optimal rate of (almost) $1/\rho - 1/2$. In particular, we see that using higher order schemes does not improve the convergence rate since in that case, the Wong-Zakai error persists. In the fractional Brownian motion case, the simplified Milstein scheme gives an optimal convergence rate of (almost) $1/2$ for the Brownian motion and for $H \in (1/4,1/2)$ the step-3 scheme gives an optimal rate of (almost) $2H - 1/2$. This answers a conjecture stated in
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